Top Languages Card not working properly #136
Comments
Same with my profile. github-readme-stats/top-langs is not showing all of my languages or the correct percentages, whereas github-profile-languages gives a more appropriate result. |
Sure, but I think the GitHub GraphQL API just takes the raw LOC count from all your repos and does the analysis from that. Not 100% sure though. |
Ok, I was mistaken, it should just fetch the top language of the repo. @anuraghazra WDYT about this? github-readme-stats/src/fetchTopLanguages.js Lines 5 to 25 in dc3e9a5
|
Working on my side, which browser are you using? |
Chrome. But the code below is showing:
query {
  user(login: "rjoydip") {
    repositories(isFork: false, first: 100) {
      nodes {
        languages(first: 1) {
          edges {
            size
            node {
              color
              name
            }
          }
        }
      }
    }
  }
} |
Yup it should fetch the correct langs. |
Mine is also weird 😕 @anuraghazra
|
NOTE: Keep in mind the maximum of 100 repos. It also gets the totalSize (in bytes) to calculate how many bytes you have written in each language. |
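To make the note above concrete, here is a minimal sketch of summing byte sizes per language across repositories. It assumes the input has the same shape as `user.repositories.nodes` in the query posted earlier; the helper name `sumLanguageBytes` and the sample data are hypothetical and not taken from the project's source.

```javascript
// Hypothetical helper: sums language sizes (in bytes) across repositories.
// `repoNodes` is assumed to look like `user.repositories.nodes` from the
// query in this thread: each repo has `languages.edges[]` with `size` and `node`.
function sumLanguageBytes(repoNodes) {
  const totals = {};
  for (const repo of repoNodes) {
    for (const edge of repo.languages.edges) {
      const name = edge.node.name;
      if (!totals[name]) {
        totals[name] = { name, color: edge.node.color, size: 0 };
      }
      totals[name].size += edge.size;
    }
  }
  return totals;
}

// Made-up sample data for illustration only.
const sampleRepoNodes = [
  { languages: { edges: [{ size: 5000, node: { color: "#f1e05a", name: "JavaScript" } }] } },
  { languages: { edges: [{ size: 2500, node: { color: "#f1e05a", name: "JavaScript" } }] } },
  { languages: { edges: [{ size: 196, node: { color: "#dea584", name: "Rust" } }] } },
  { languages: { edges: [] } }, // a repo with no detected language
];

const sampleTotals = sumLanguageBytes(sampleRepoNodes);
console.log(sampleTotals.JavaScript.size); // 7500
console.log(sampleTotals.Rust.size); // 196
```

Because only the first 100 repositories are fetched, any totals computed this way would cover at most those 100 repos.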
Is that how we want it to be? Is there not a better implementation?
|
That's how GitHub calculates it, and it's all fetched from GitHub's API, so the data itself should not be wrong. Maybe the data processing on my side is wrong. I'll have to do some experiments. |
I'll look into this tomorrow. |
Hi @rjoydip yes, but as you can see:
"edges": [
  {
    "size": 196,
    "node": {
      "color": "#dea584",
      "name": "Rust"
    }
  }
]
There is only one Rust entry in those 100 results, and its size is 196 bytes. I think this is why it's not showing. |
Maybe it would be better if I change the GraphQL query to fetch 5 languages from each repo, because for now I'm only selecting one language from each repo.
user(login: "rjoydip") {
  repositories(isFork: false, first: 100) {
    nodes {
      languages(first: 5) {
        edges {
          size
          node {
            color
            name
          }
        }
      }
    }
  }
} |
@anuraghazra Yes, I saw the same thing. It would be better to make it dynamic. |
Not dynamic; a maximum of 5 or 10 would do the job, since a repo can't have that many languages anyway. And isFork should always be false, because we don't want to count forked repos. For example, anyone who forked reactjs would appear to have a lot of JS code. |
I am of the opinion that we should count forks too. It's something that GitHub also does, and forks also exist because people build their own projects on them. I am not sure if GitHub provides this, and I am not exactly an expert on their v4 API, but the extensions that provide the same solution must be querying it somehow. I'll look into that. |
Just a link for some info on another solution: |
FYI...
{
  user(login: "rjoydip") {
    repositories(isFork: false, first: 100, orderBy: {field: UPDATED_AT, direction: DESC}) {
      nodes {
        name
        updatedAt
        languages(first: 5, orderBy: {field: SIZE, direction: DESC}) {
          nodes {
            name
          }
        }
        primaryLanguage {
          name
        }
      }
    }
  }
} |
Useful, thanks for this. I think it would be enough to not use the languages, just the primary one. / cc: @anuraghazra @rjoydip
{
  user(login: "rjoydip") {
    repositories(isFork: false, first: 100, orderBy: {field: UPDATED_AT, direction: DESC}) {
      nodes {
        primaryLanguage {
          name
        }
      }
    }
  }
}
That gives out something like this:
{
  "data": {
    "user": {
      "repositories": {
        "nodes": [
          {
            "primaryLanguage": {
              "name": "TypeScript"
            }
          },
          {
            "primaryLanguage": {
              "name": "TypeScript"
            }
          },
          {
            "primaryLanguage": {
              "name": "Java"
            }
          },
          {
            "primaryLanguage": null
          }
        ]
      } |
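For illustration, counting repositories by primary language from a response like the one above could look roughly like this. This is only a sketch: `countPrimaryLanguages` is a hypothetical name, and the input is assumed to be the `nodes` array from that response, where `primaryLanguage` may be `null`.

```javascript
// Hypothetical helper: counts how many repos have each primary language.
// Repos whose primaryLanguage is null (no detected language) are skipped.
function countPrimaryLanguages(repoNodes) {
  const counts = new Map();
  for (const repo of repoNodes) {
    if (repo.primaryLanguage === null) continue;
    const name = repo.primaryLanguage.name;
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  // Sort by repo count, highest first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Same values as the sample response above.
const primaryNodes = [
  { primaryLanguage: { name: "TypeScript" } },
  { primaryLanguage: { name: "TypeScript" } },
  { primaryLanguage: { name: "Java" } },
  { primaryLanguage: null },
];

for (const [name, count] of countPrimaryLanguages(primaryNodes)) {
  console.log(`${name}: ${count}`);
}
// TypeScript: 2
// Java: 1
```

Note that this counts repositories, not bytes, so a tiny repo weighs as much as a huge one.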
I think this would be good for a "most used language" widget, as it is currently an approximation across many repos. For example, I could have one repo that is just an Express app serving mostly HTML, but it would say 100% TypeScript. |
Does this mean that HTML isn't considered a language in this analysis? I am a bit confused. Can you give me an example repo? |
I don't think we can do language analysis effectively. For example, if someone uploaded node_modules to their GitHub, their JavaScript would be 100% no matter what. Same as #153 |
True, but nobody should ever do that (upload their node_modules). If they do, they cannot then be angry at our code, which follows industry best practices, and it also affects GitHub's own analysis. |
Yup, but that's not my point, there are lot of scenarios where we cannot evaluate code correctly and there is no perfect way to do that, lets just take an example of my website's github repo which has a https://github.com/anuraghazra/anuraghazra.github.io/tree/master |
I also came here because I saw discrepancies between this tool and my Chrome extension-generated pie chart (linked above). As an example, this tool says my top language is JavaScript with 46.21% and Lua is second with 29.29%, while the pie chart says I have 5 JavaScript repos and 15 Lua repos. However, if I do a search for JavaScript repos, I only get 1. Not sure what to make of that; presumably this tool counts LOC while the pie chart counts top language per repo, but perhaps the pie chart counts forks too, since it comes up with 5 and not 1? By the way, not sure if it's relevant, but organization profile pages (like https://github.com/github) actually list the top 5 languages in the org's repos (without bars or percentages or anything fancy). It looks like those are just top languages, not LOC. I might be mistaken though. |
And i also like the suggestion of @stemount
I think the "Top Languages" labeling is misleading, it should be "Most used languages" |
@NikhilCodes I don't know what's wrong with your profile, but I've checked other users' stats with the fix I'm working on and they are all fine except yours. By the way, I've checked the GraphQL request and it seems you have a very, very large Python repo. This repo is so huge in bytes that it completely outweighs your Dart stats, so I think the stats are fine.
{
  "nameWithOwner": "NikhilCodes/VirtualBLU",
  "isFork": false,
  "languages": {
    "edges": [
      {
        "size": 78537442,
        "node": {
          "color": "#3572A5",
          "name": "Python"
        }
      }
    ]
  }
},
The updated GraphQL query looks like this:
user(login: "NikhilCodes") {
  repositories(ownerAffiliations: OWNER, isFork: false, first: 100) {
    nodes {
      nameWithOwner
      isFork
      languages(first: 10, orderBy: {field: SIZE, direction: DESC}) {
        edges {
          size
          node {
            color
            name
          }
        }
      }
    }
  }
} |
Much better! Thank you. However, I believe there's something wrong with your "tie-break" algorithm, so to speak. The Chrome extension linked earlier lists these as my top languages:
And then a bunch of languages, including Java, with 1 repo each. As you'll notice, your new code lists Assembly as my third repo and then C below that. It seems that it's simply ignoring my HTML and CSS repos, because I have as many of them as I have assembly repos! That would also explain why Java is on the list at all even though I only have one Java repo – it's ignoring my 4 Ruby repos because I already have 4 C repos on the list. So, to sum it up, I think your patched code is getting there, but if there are several languages with the same repo count, it only displays one of them and ignores the rest. |
@tobiasvl It is not about how many repos you have. You could have 100 JS repos with 10 bytes of code each and 1 Python repo with 20000 bytes of code, and Python would be at the top in that case.
As you can see, I changed the GraphQL query to fetch 10 languages from every repo, and I'm counting all of them. |
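The arithmetic behind that example can be checked with a short sketch. The data below is hypothetical and simply mirrors the numbers in the comment above (100 JavaScript repos of 10 bytes each, 1 Python repo of 20000 bytes); it is not the project's actual code.

```javascript
// Hypothetical data mirroring the example: 100 small JavaScript repos
// and a single large Python repo.
const exampleRepos = [
  ...Array.from({ length: 100 }, () => ({ language: "JavaScript", bytes: 10 })),
  { language: "Python", bytes: 20000 },
];

const repoCount = {};
const byteTotal = {};
for (const { language, bytes } of exampleRepos) {
  repoCount[language] = (repoCount[language] || 0) + 1;
  byteTotal[language] = (byteTotal[language] || 0) + bytes;
}

console.log(repoCount); // { JavaScript: 100, Python: 1 }
console.log(byteTotal); // { JavaScript: 1000, Python: 20000 }
```

Ranked by repo count, JavaScript leads 100 to 1; ranked by bytes, Python leads 20000 to 1000, which is why the two tools can disagree.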
Hmm, OK, thanks. If it's intentional then I'm definitely fine with this. I don't want HTML and CSS on my list anyway 😅 |
Also @tobiasvl
Actually, there is no special algorithm in play here. I'm just sorting and processing the data coming from GitHub's API and picking the top languages by size. |
I see, thanks. Ship it! |
Oh my god. I just checked their source code: they are calculating "how MANY languages a user has in their profile", while github-readme-stats is calculating the "MOST used languages in a user's profile". And they have a package called gh-polyglot which is doing this :- |
@VictorNS69 |
Yes, I think it's pertinent to point out that strangely, the percentages seem to be within the top 5. You don't have 1.69% TeX in all your repos, but it makes up 1.69% of all the code within your top 5 languages. (Unless I've misunderstood.) |
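If that reading is right, the percentages would be computed against the sum of the displayed languages only, not against all code. A minimal sketch of that interpretation follows; the function name `topLanguagePercentages`, the default limit of 5, and the sample byte totals are all assumptions made for illustration.

```javascript
// Hypothetical helper: keeps the `limit` largest languages by byte size and
// expresses each as a percentage of the sum of those kept languages only.
function topLanguagePercentages(byteTotals, limit = 5) {
  const top = Object.entries(byteTotals)
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit);
  const shownBytes = top.reduce((sum, [, size]) => sum + size, 0);
  return top.map(([name, size]) => ({
    name,
    percent: Number(((size * 100) / shownBytes).toFixed(2)),
  }));
}

// Made-up byte totals; Makefile falls outside the top 5 and is ignored.
const exampleTotals = { Java: 400, Python: 300, C: 150, Shell: 100, TeX: 50, Makefile: 25 };

for (const { name, percent } of topLanguagePercentages(exampleTotals)) {
  console.log(`${name}: ${percent}%`);
}
// Java: 40%
// Python: 30%
// C: 15%
// Shell: 10%
// TeX: 5%
```

In this made-up data TeX is 5% of the top-5 total (50 of 1000 bytes) but about 4.88% of all code (50 of 1025 bytes), which is the kind of small gap described above.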
He has one repo with a good amount of TeX https://github.com/VictorNS69/Apuntes-Ciber |
My profile does not show any top languages; the card is empty. |
@abdullasirajudeen because you don't have anything, you have 5 repos 4 of them are forks which won't be counted and one is your readme repo which does not have any code. |
@anuraghazra I think you mean @abdullasirajudeen? (I have lots of repos) |
Private repositories are not counted here. |
oh yeahh, sorry for the wrong mention. |
Mine is also working incorrectly, |
Is there any stats for the languages you have contributed in? (Company repositories) |
Same here. I haven't used python in very long but it still shows up, and doesn't show JS and TS which i use mostly |
same here |
Describe the bug
I have quite a few repos with the top language being javascript. However, the top languages card doesn't show javascript at all in the list.
Expected behavior
It should show js as one of the top languages.
Screenshots / Live demo link