Huggingface
- New York
- https://twitter.com/sam_shleifer
1,014 contributions in the last year
Contribution activity
May 2020
- sshleifer/TextBrewer Python
Created a pull request in huggingface/transformers that received 10 comments
[Model Cards] Add 1010 model cards for Helsinki-NLP
Generated using tools in convert_marian_to_pytorch.py.
Takes the bottom most entry in each opus-mt-train/models/*/README.md. This assumes that the…
+15,357 −0 • 10 comments
- [test_pipelines] Mark tests > 10s @slow, small speedups
- [cleanup] test_tokenization_common.py
- [MarianTokenizer] implement save_vocabulary and other common methods
- [Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models
- [Marian] Fix typo in docstring
- useless chg
- [CI] remove run_tests_torch_and_tf
- [tests] make pipelines tests faster with smaller models
- Pipelines: use tokenizer.max_len
- [examples/summarization] run_train_tiny.sh: remove unused kwarg
- Tokenizer.batch_decode convenience method
- [Marian] documentation and AutoModel support
- [Fix #3963] GPT2 FP16
- Fix gpt2 fp16
- Longformer
- [cleanup] test_tokenization_common.py
- [MbartTokenizer] fix loading from pretrained checkpoints
- Distributed eval: SequentialDistributedSampler + gather all results
- Fix nn.DataParallel compatibility in PyTorch 1.5
- [Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models
- Fix BART tests on GPU
- [Docs, Notebook] Include generation pipeline
- [CI] remove run_tests_torch_and_tf
- [docs] fix typo
- [README] Corrected some grammatical mistakes
- [tests] make pipelines tests faster with smaller models
- Pipelines: use tokenizer.max_len
- [Marian] documentation and AutoModel support
- Fix overwrite_cache behaviour for pytorch lightning examples
- Reformer
Created an issue in Helsinki-NLP/OPUS-MT-train that received 6 comments
[Language Codes] How are models named?
For example,
In,
cmn+cn+yue+ze_zh+zh_cn+zh_CN+zh_HK+zh_tw+zh_TW+zh_yue+zhs+zht+zh-de
Is there a table or some other source for what zh_HK, zh_yue,…
6 comments
- [PretrainedTokenizer] is <unk> a special token?
- [docs] AutoModelWithLMHead(model_name, **kwargs)
- [Bart/Marian] ignore output_attentions when invoked through AutoModel
- [pipelines] Failing @slow test for TF Summarization
- [infra] make a tiny "distilroberta-base" to speed up test_pipelines.py and test_examples.py
- MarianMTModel: Runtime Errors
- [examples] text_classification/run_pl.sh error
- [Proposal] Small HfAPI Cleanup
- [Marian] Key-Error for some languages
- [Marian] Multilingual models require language codes
- [Marian] @-@ symbol causes strange generations
- [Marian] Readme parser defaults to porting oldest model
3 contributions in private repositories, May 5 – May 12