Highlights
- Arctic Code Vault Contributor
- Pro
Popular repositories
- patrickvonplaten/datasets-1 (Python)

1,335 contributions in the last year

Contribution activity

October 1, 2020
patrickvonplaten has no activity yet for this period.

September 2020
Created a pull request in huggingface/transformers that received 11 comments:
[Longformer, Bert, Roberta, ...] Fix multi gpu training
Fixes #6256. Issue #6256 shows that distributed training is not possible when the model has layers that are not used at all. Bert, Roberta and Lon…
+179 −49 • 11 comments
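The failure mode this PR describes can be illustrated with a minimal sketch. The model and layer names below are hypothetical, not taken from the PR: a module that defines a layer but never calls it in `forward` produces no gradient for that layer, which is exactly what makes `torch.nn.parallel.DistributedDataParallel` stall while waiting for gradients from every registered parameter.

```python
import torch
import torch.nn as nn

# Hypothetical model illustrating the issue: `self.unused` is
# registered as a submodule but never used in the forward pass.
class ModelWithUnusedLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 4)
        self.unused = nn.Linear(4, 4)  # defined but never called

    def forward(self, x):
        return self.used(x)  # self.unused plays no part

model = ModelWithUnusedLayer()
loss = model(torch.randn(2, 4)).sum()
loss.backward()

# The used layer gets a gradient; the unused one does not.
# Under DDP, parameters with no gradient leave the gradient
# reduction waiting indefinitely unless the wrapper is told
# to expect them.
print(model.used.weight.grad is not None)
print(model.unused.weight.grad is None)
```

One common workaround is constructing `DistributedDataParallel` with `find_unused_parameters=True`, at the cost of an extra traversal of the autograd graph per iteration; the PR instead fixes the models so that no registered layer goes unused.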
- [Seq2Seq] Fix a couple of bugs and clean examples
- [RAG] Model cards - clean cards
- [Rag] fix rag retriever save_pretrained method
- [Rag] Fix wrong usage of `num_beams` and `bos_token_id` in Rag Sequence generation
- [RAG] Add missing doc and attention_mask to rag
- [RAG] Add `attention_mask` to RAG generate
- [RAG] PR to save status of previous RAG code
- [EncoderDecoderModel] fix indentation error
- [WIP RAG] Finalize RAG parallel
- [BertGeneration] Clean naming
- [BertGeneration, Docs] Fix another old name in docs
- [BertGeneration] Correct Doc Title
- [Longformer] Fix longformer documentation
- [WIP] Refactoring the generate() function
- [LXMERT] Fix tests on gpu
- Torchscript benchmark measure
- [Docs, Examples] Fix QA example for PT
- [Electra] fix warning for position ids
- Create README.md
- [EncoderDecoder] Add xlm-roberta to encoder decoder
- [WIP, TF] replace keras dense by keras.layers.DenseEinsum
- Add DeBERTa model
- Adding gradient checkpointing to GPT2
- SqueezeBERT architecture
- Custom TF weights loading
- Make T5 compatible with ONNX
- [WIP] ProphetNet
- [RAG] Clean Rag readme in examples
- [T5] allow config.decoder_layers to control decoder size
- Replaced torch.load for loading the pretrained vocab of TransformerXL tokenizer to pickle.load
- [RAG] Remove dependency on `examples/seq2seq` from rag
- Enable pegasus fp16 by clamping large activations
- Remove unhelpful bart warning
- [RAG] Fix retrieval offset in RAG's HfIndex and better integration tests
- Make PyTorch model files independent from each other
- Clean RAG docs and template docs
- [Benchmarks] Change all args to from `no_...` to their positive form
- [Longformer, Bert, Roberta, ...] Fix multi gpu training
- [Bug fix] Fixed target_mapping preparation for XLNet (Pytorch)
- Add tests and fix various bugs in ModelOutput
- fix deprecation warnings
- Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models.
- [from_pretrained] Allow tokenizer_type ≠ model_type
- [generation] consistently add eos tokens
- [gen utils] missing else case
- Some pull request reviews not shown.
Created an issue in huggingface/transformers that received 1 comment:
Missing keys when loading weights in TF are not useful
This concerns all TF models. If one loads the weights of a TensorFlow model, these lines are run: transformers/src/transformers/modeling_tf_utils.py …
1 comment