1,325 contributions in the last year
Contribution activity
May 2020
Created a pull request in huggingface/nlp that received 6 comments
Update to simplify some datasets conversion
This PR updates the encoding of Values such as integers, booleans, and floats to use Python casting, avoiding the need to cast in the dataset scripts, as m…
+11 −1 • 6 comments
- [Csv] add tests for csv dataset script
- Replace checksums files by Dataset infos json
- [WIP] add wmt14
- Add download gg drive
- add tests
- [Datasets] add ted_hrlr
- Add nbytes + nexamples check
- fix overflow check
- Fix arrow writer for big datasets using writer_batch_size
- Beam datasets
- fix cache dir in builder tests
- Metrics - refactoring, adding support for download and distributed metrics
- Better cached path
- [Features] Typo in generate_from_dict
- [Command Convert] remove tensorflow import
- [PyArrow Feature] fix py arrow bool
- [Features] Strip str key before dict look-up
- [Load module] allow kwargs into load module
- [Tests] add slow tests
- add metrics which require download files from github
- Fix map caching notebooks
- [Circle ci] Install a virtual env before running tests
- [TF 2.2 compat] use tf.VariableAggregation.ONLY_FIRST_REPLICA
- [CI] remove run_tests_torch_and_tf
- [tests] make torch pipelines tests faster with smaller models
- Simplify cache vars and allow for TRANSFORMERS_CACHE env
- [Marian] documentation and AutoModel support
- [Model Cards] Add 1010 model cards for Helsinki-NLP
- [Fix #3963] GPT2 FP16
- [Pipeline, Generation] tf generation pipeline bug
- TF version of the trainer