Popular repositories
734 contributions in the last year
Contribution activity
May 2020
Created a pull request in huggingface/nlp that received 8 comments
[Datasets ToDo-List] add datasets
Description: This PR acts as a dashboard to see which datasets have been added to the library and work. CircleCI should always be green so that we can be …
+9,794 −8,886 • 8 comments
- Update Overview.ipynb
- remove webis
- [Newsroom] add newsroom
- [Tests] Local => aws
- make style
- Qa4mre - add dataset
- [Clean-up] remove under construction datastes
- fix reddit tifu dummy data
- [Reclor] fix reclor
- convert can use manual dir as second argument
- [New structure on AWS] Adapt paths
- Add trivia_q
- [Manual downloads] add logic proposal for manual downloads and add wikihow
- [Reddit] add reddit
- [Cmrc 2018] fix cmrc2018
- [Csv] add tests for csv dataset script
- [WIP] add wmt14
- [Paracrawl] add paracrawl
- Add Flores
- [Load => load_dataset] change naming
- [TedHrLr] add left dummy data
- [Datasets] add ted_hrlr
- [Convert] add new pattern
- [Tests] skip beam dataset tests for now
- pin flake 8
- Some pull requests not shown.
- [Docs, Notebook] Include generation pipeline
- Reformer enwik8 - Model card
- [Reformer] Add Enwiki8 Reformer Model - Adapt convert script
- [Jax] top_k_top_p
- [Pipeline, Generation] tf generation pipeline bug
- [WIP] [Benchmark] Memory benchmark utils
- [Reformer] Doctsring: fix examples again
- [Reformer] Fix example and error message
- [Reformer] fix docstring
- [Roberta] fix hard wired pad token id
- [Reformer] Move model card to google model
- Xsum, require manual download of some files
- Qa4mre - add dataset
- add writer_batch_size to GeneratorBasedBuilder
- Webis tl-dr
- [Manual downloads] add logic proposal for manual downloads and add wikihow
- Add per type scores in seqeval metric
- [Cmrc 2018] fix cmrc2018
- Replace checksums files by Dataset infos json
- [Csv] add tests for csv dataset script
- Cleanup notebooks and various fixes
- Add wiki40b
- [WIP] add wmt14
- Add download gg drive
- Add boolq
- Add nbytes + nexamples check
- adding RACE, QASC, Super_glue and Tiny_shakespear datasets
- Metrics - refactoring, adding support for download and distributed metrics
- [Datasets ToDo-List] add datasets
- Fix tests
- Better cached path
- Update to simplify some datasets conversion
- [Features] Typo in generate_from_dict
- Update remote checksums instead of overwrite
- Big cleanup/refactoring for clean serialization
- add metrics which require download files from github
- Some pull request reviews not shown.
- Fix nn.DataParallel compatibility in PyTorch 1.5
- Longformer
- Fix for #3846
- [Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models
- [tests] make pipelines tests faster with smaller models
- [Pipeline, Generation] tf generation pipeline bug
- [Model Cards] Add 1010 model cards for Helsinki-NLP
- [Marian] documentation and AutoModel support
- Reformer
- [Fix #3963] GPT2 FP16
Created an issue in huggingface/nlp that received 3 comments
[Checksums] Error for some datasets
The checksums command works very nicely for squad. But for crime_and_punish and xnli, the same bug happens:
When running:
python nlp-cli nlp-cli te…
3 comments