
gpt

Here are 106 public repositories matching this topic...

transformers
stas00 commented Mar 20, 2021

Recently the HF Trainer was extended to support full fp16 evaluation via --fp16_full_eval. I'd have expected it to be either equal to or faster than eval with the fp32 model, but surprisingly I noticed a 25% slowdown when using it.

This may or may not impact DeepSpeed as well, which also runs eval in fp16, but we can't compare it to a baseline, since it only runs fp16.

I wonder if someone would like to…
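
A quick way to reproduce the comparison is a timing sketch like the one below; the model and dataset names (bert-base-cased, GLUE MRPC) are placeholders I chose, not from the report, and timings will vary by hardware:

    import time
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    model_name = "bert-base-cased"  # placeholder; any small model works
    tok = AutoTokenizer.from_pretrained(model_name)
    ds = load_dataset("glue", "mrpc", split="validation")
    ds = ds.map(lambda e: tok(e["sentence1"], e["sentence2"],
                              truncation=True, padding="max_length"),
                batched=True)
    ds = ds.rename_column("label", "labels")

    # Run the same eval twice: once in fp32, once with fp16_full_eval
    for fp16_eval in (False, True):
        model = AutoModelForSequenceClassification.from_pretrained(model_name)
        args = TrainingArguments(output_dir="/tmp/eval_bench",
                                 per_device_eval_batch_size=32,
                                 fp16_full_eval=fp16_eval)
        trainer = Trainer(model=model, args=args)
        start = time.time()
        trainer.evaluate(eval_dataset=ds)
        print(f"fp16_full_eval={fp16_eval}: {time.time() - start:.1f}s")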

tokenizers
david-waterworth commented Feb 27, 2021

The Split class accepts a SplitDelimiterBehavior, which is really useful. Punctuation, however, always uses SplitDelimiterBehavior::Isolated (and Whitespace, on the other hand, behaves like SplitDelimiterBehavior::Removed).

impl PreTokenizer for Punctuation {
    fn pre_tokenize(&self, pretokenized: &mut PreTokenizedString) -> Result<()> {
        // The delimiter behavior is hard-coded to Isolated and cannot be configured
        pretokenized.split(|_, s| s.split(is_punc, SplitDelimiterBehavior::Isolated))
    }
}
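
To see the difference from the Python side, here is a small sketch contrasting the two pre-tokenizers; the example string is mine, and the exact offsets in the printed output are elided:

    from tokenizers import pre_tokenizers

    text = "hello, world"

    # Punctuation: the comma is always isolated into its own piece
    print(pre_tokenizers.Punctuation().pre_tokenize_str(text))
    # [('hello', ...), (',', ...), (' world', ...)]

    # Split: the delimiter behavior is configurable, e.g. drop the comma entirely
    print(pre_tokenizers.Split(",", behavior="removed").pre_tokenize_str(text))
    # [('hello', ...), (' world', ...)]
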
transfer-learning-conv-ai
jb33k commented Jun 4, 2019

I'm playing around with this wonderful code, but I'm running into a curious issue when I try to train the model with my own data.

I replicated the personachat_self_original.json file structure and added my own data. I deleted the dataset_cache_OpenAIGPTTokenizer file, but when I try to train, I get this error:

    INFO:train.py:Pad inputs and convert to Tensor
    Traceback (most recent call last):
    …
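
For reference, here is a sketch of the layout that personachat_self_original.json follows, reconstructed from the repo's README; the strings are placeholders of mine:

    # Top level: "train" and "valid" splits with the same shape
    data = {
        "train": [
            {
                # a handful of persona sentences for the speaker
                "personality": ["i like to ski.", "i have two dogs."],
                "utterances": [
                    {
                        # candidate replies; by convention the gold reply comes last
                        "candidates": ["a distractor reply", "the true next utterance"],
                        # the dialogue turns preceding the reply
                        "history": ["hello, how are you?"],
                    }
                ],
            }
        ],
        "valid": [],  # same structure as "train"
    }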
