Sam Shleifer sshleifer

Research Engineer at Huggingface. Formerly Machine Learning Research @kensho and @Stanford

Pinned

charactr

Visualize your iMessage conversations

Python 4

object_detection_kitti

TF Object Detection on Kitti Data

Python 23 16

backtranslated-imdb

Backtranslations of IMDB movie reviews for Data Augmentation Purposes

7 2

My Favorite apps and workflow stuff ...

### Mac

1. [Spectacle](https://www.spectacleapp.com/)
2. [Rescuetime](https://www.rescuetime.com/dashboard)
3. [Self Control](https://selfcontrolapp.com/)
4. [iTerm2](https://www.iterm2.com/)

Graph-WaveNet

Forked from nnzhan/Graph-WaveNet

Modifications to Graph Wavenet

Python 19 3

mixmatch

Pytorch implementation of https://arxiv.org/abs/1905.02249v1

Jupyter Notebook 9

1,014 contributions in the last year

Contribution activity

May 2020

sshleifer/TextBrewer Python May 17

Created a pull request in huggingface/transformers that received 10 comments

[Model Cards] Add 1010 model cards for Helsinki-NLP

Generated using tools in convert_marian_to_pytorch.py. Takes the bottom most entry in each opus-mt-train/models/*/README.md. This assumes that the…

+15,357 −0 • 10 comments

[test_pipelines] Mark tests > 10s @slow, small speedups May 17
[cleanup] test_tokenization_common.py May 15
[MarianTokenizer] implement save_vocabulary and other common methods May 15
[Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models May 11
[Marian] Fix typo in docstring May 11
useless chg May 10
[CI] remove run_tests_torch_and_tf May 10
[tests] make pipelines tests faster with smaller models May 8
Pipelines: use tokenizer.max_len May 8
[examples/summarization] run_train_tiny.sh: remove unused kwarg May 7
Tokenizer.batch_decode convenience method May 5
[Marian] documentation and AutoModel support May 5
[Fix #3963] GPT2 FP16 May 1
Fix gpt2 fp16 May 1

Longformer May 17
[cleanup] test_tokenization_common.py May 15
[MbartTokenizer] fix loading from pretrained checkpoints May 15
Distributed eval: SequentialDistributedSampler + gather all results May 14
Fix nn.DataParallel compatibility in PyTorch 1.5 May 14
[Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models May 13
Fix BART tests on GPU May 12
[Docs, Notebook] Include generation pipeline May 11
[CI] remove run_tests_torch_and_tf May 11
[docs] fix typo May 10
[README] Corrected some grammatical mistakes May 10
[tests] make pipelines tests faster with smaller models May 8
Pipelines: use tokenizer.max_len May 8
[Marian] documentation and AutoModel support May 5
Fix overwrite_cache behaviour for pytorch lightning examples May 5
Reformer May 4

Created an issue in Helsinki-NLP/OPUS-MT-train that received 6 comments

[Language Codes] How are models named?

For example, In, cmn+cn+yue+ze_zh+zh_cn+zh_CN+zh_HK+zh_tw+zh_TW+zh_yue+zhs+zht+zh-de Is there a table or some other source for what zh_HK, zh_yue,…

6 comments

Multilingual preprocessing May 13
Are posted test sets preprocessed? May 12
[ga-en] Broken Link to test.txt May 8
models/ru-fr/README.md May 7

Is the code for boss.dev open source? May 7

missing keys for some low-resource language pairs May 6

3 contributions in private repositories May 5 – May 12
