Document raw.githubusercontent.com #8031

Closed
jsoref opened this issue Jul 8, 2021 · 12 comments
Labels
content Problems or updates in the docs content on docs.github.com.

Comments

@jsoref
Contributor

jsoref commented Jul 8, 2021

What article on docs.github.com is affected?

Any page that talks about limits but doesn't say whether raw.githubusercontent.com is covered.

What part(s) of the article would you like to see updated?

Any page that talks about limits.

Additional information

The domain is mentioned occasionally, here's code that mentions it:
https://github.com/github/docs/search?q=raw.githubusercontent.com+-path%3Atranslations&type=code

Oddly, there are way more hits if you don't exclude translations. Based on my experience w/ translations and a tool to check translations, this means that the translations are buggy and there is missing quality control on them.

Here's a visible reference:
https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-repository-on-github/limits-for-viewing-content-and-diffs-in-a-repository#text-limits

Text files over 5 MB are only available through their raw URLs, which are served through raw.githubusercontent.com; for example, https://raw.githubusercontent.com/octocat/Spoon-Knife/master/index.html. Click the Raw button to get the raw URL for a file.
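The URL pattern in the quoted docs example can be sketched as a small helper. This is a minimal illustration that assumes the `owner/repo/ref/path` layout seen in the example URL above; the function name `raw_url` is hypothetical and not part of any GitHub tooling.

```python
from urllib.parse import quote


def raw_url(owner: str, repo: str, ref: str, path: str) -> str:
    """Build a raw.githubusercontent.com URL for a file in a public repository.

    Assumption: the layout is <owner>/<repo>/<ref>/<path>, as in the docs example.
    """
    parts = [
        quote(owner, safe=""),
        quote(repo, safe=""),
        # Keep "/" unescaped so nested paths and refs such as "feature/x" survive.
        quote(ref, safe="/"),
        quote(path.lstrip("/"), safe="/"),
    ]
    return "https://raw.githubusercontent.com/" + "/".join(parts)


print(raw_url("octocat", "Spoon-Knife", "master", "index.html"))
```

Clicking the Raw button remains the authoritative way to get the URL for a given file; the helper only mirrors the pattern of the example.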

Here is @canuckjacq indicating that they have no idea what the correct answer is:
https://github.community/t/raw-githubusercontent-com-rate-limit/142444/6
https://docs.github.com/en/free-pro-team@latest/developers/apps/rate-limits-for-github-apps#normal-user-to-server-rate-limits

This has come up repeatedly.

@jsoref jsoref added the content Problems or updates in the docs content on docs.github.com. label Jul 8, 2021
@docubot docubot closed this as completed Jul 8, 2021
@docubot docubot added invalid and removed content Problems or updates in the docs content on docs.github.com. labels Jul 8, 2021
@docubot
Collaborator

docubot commented Jul 8, 2021

This issue appears to have been opened accidentally. I'm going to close it now, but feel free to open a new issue or ask any questions in discussions!

@github-actions github-actions bot added the triage Do not begin working on this issue until triaged by the team. label Jul 8, 2021
@jsoref
Contributor Author

jsoref commented Jul 8, 2021

@janiceilene please undo your bot.

@ramyaparimi ramyaparimi added content Problems or updates in the docs content on docs.github.com. and removed invalid triage Do not begin working on this issue until triaged by the team. labels Jul 8, 2021
@ramyaparimi
Contributor

@jsoref sorry about that, I am triaging this for the team to take a look 👀

@ramyaparimi ramyaparimi reopened this Jul 8, 2021
@ramyaparimi ramyaparimi moved this from Done to Content review needed in Docs open source board Jul 8, 2021
@github-actions github-actions bot added the triage Do not begin working on this issue until triaged by the team. label Jul 8, 2021
@ramyaparimi ramyaparimi removed the triage Do not begin working on this issue until triaged by the team. label Jul 8, 2021
@stevecat
Member

Hi there @jsoref! Thanks for opening this.

I'm speaking with our engineering team about this, both to confirm what limits there are and also to confirm if it's something we can document. I know that this endpoint is unauthenticated, so supplying a token won't make a difference. It might take a while but I'll follow up here once I have more to share.

The difference in the number of search results, with and without -path:translations, is interesting too! From what I can see, there are a few articles (such as the REST API reference for code scanning) that have recently been updated to remove a mention of raw.githubusercontent.com but where the translated version has not yet caught up. You can find more information about the translations in our "CONTRIBUTING.md".

@stevecat
Member

Hi again @jsoref! I return with information to share 😄

I spoke with our engineering team and learnt that there's a limit of 5000 requests per hour per IP address. Because of internal routing and caching, that 5000 figure isn't exact: we may accept more, but we may sometimes accept fewer.
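A client that wants to stay under the figure mentioned here could throttle itself. The following is a minimal sketch of a sliding-window limiter, not anything GitHub provides: the class name `SlidingWindowLimiter` is hypothetical, and the default of 4000 requests per hour is an assumed safety margin below the approximate 5000 figure.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests=4000, window_seconds=3600.0, clock=time.monotonic):
        # 4000 is an assumed margin, since the 5000/hour figure is not exact.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self._stamps = deque()

    def try_acquire(self) -> bool:
        """Return True and record the request if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False
```

A caller would check `limiter.try_acquire()` before each download and wait or skip when it returns `False`. Because the limit is per IP address, a limiter like this only helps when all traffic from that address goes through it.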

As was pointed out to me, if you're at risk of hitting this limit, then you're probably doing something wrong and there's a better way to obtain or even store the file.
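One way to read "store the file" is to keep a local copy and reuse it instead of fetching it on every run. This is a minimal sketch under that assumption; `cached_fetch`, `fetch_bytes`, and the one-hour default age are hypothetical choices, not a documented recommendation.

```python
import time
import urllib.request
from pathlib import Path


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download a URL with the standard library and return the body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def cached_fetch(url: str, cache_file: Path, max_age_seconds: float = 3600.0,
                 fetch=fetch_bytes) -> bytes:
    """Return the cached copy if it is fresh enough, otherwise download and store it."""
    if cache_file.exists():
        age = time.time() - cache_file.stat().st_mtime
        if age < max_age_seconds:
            return cache_file.read_bytes()
    data = fetch(url)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(data)
    return data
```

With a one-hour cache, a job that runs every minute makes roughly one request per hour per file instead of sixty.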

I'm going to work on documenting this limit but because I'll need to work closely with the engineering team to ensure everything is covered, I'm going to move this to our internal documentation repository. I'll close this issue but I'll mention you here again when those changes ship!

Thanks for taking the time to request this and for improving our documentation!

@zulh-civo

I was looking for "rate limits for GitHub raw content e.g. https://raw.githubusercontent.com/github/fetch/master/README.md" and I found this issue.

@stevecat - have you updated the official docs? Is "5000 requests per hour per IP address" still valid?

@stevecat
Member

Hey @zulh-civo 👋 Our docs haven't been updated with this information yet; I hope to get to that shortly. The 5000 requests per hour is still valid, with all the caveats I described in my comment above.

@jsoref
Contributor Author

jsoref commented Oct 27, 2021

@stevecat: Thanks for the update. Fwiw, the translation count discrepancy remains (3 months later).

For the curious, that content appears to have moved here:
https://github.com/github/docs/blob/2289ca70c7ec7ec1b81eafdb333cd0157fafaa08/translations/README.md

(And yes, I've read that equivalent content a couple of times, it's why I haven't actively filed bugs or PRs against it.)

(When I worked at a large company, if something stayed out of sync for this long, I'd walk over to someone involved and ask them about it.)

@jessuppi

Hey @stevecat @ramyaparimi

Is there any update on this yet after 1+ years? Specifically, are you able to confirm:

  1. Are requests for "raw" files served via the GitHub API, i.e. would tokens in wget requests have any benefit?
  2. Would a high number of requests be better routed via the api.github... URL instead of the "raw" URL, with personal access tokens included in the requests, for better stability?
  3. What is the current rate limit for "raw" files? My project SlickStack has noticed random timeouts or failed wget from GitHub over the past few years, and I'm wondering if this could be why.
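
Whatever the cause of the timeouts described in question 3 (the thread does not confirm that the rate limit is responsible), a client can soften transient failures by retrying with exponential backoff. This is a minimal sketch; `fetch_with_retry`, the attempt count, and the delays are hypothetical choices.

```python
import time
import urllib.error
import urllib.request


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download a URL with the standard library and return the body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_retry(url: str, attempts: int = 4, base_delay: float = 1.0,
                     fetch=fetch_bytes, sleep=time.sleep) -> bytes:
    """Retry transient failures, doubling the delay after each failed attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError as exc:
            # URLError, HTTPError and socket timeouts all derive from OSError.
            # Assumption: 4xx responses other than 429 will not succeed on retry.
            client_error = (
                isinstance(exc, urllib.error.HTTPError)
                and 400 <= exc.code < 500
                and exc.code != 429
            )
            if client_error or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between attempts are 1, 2, and 4 seconds before the last error is re-raised.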

A related thread on StackOverflow:
https://stackoverflow.com/questions/66522261/github-limit-on-public-repositories

@jsoref
Contributor Author

jsoref commented Dec 30, 2022

There was at least one raw outage this month https://www.githubstatus.com/incidents/d5158f5wh1ly?utm_ts=1671136320


10 participants