Quick Start
Prerequisites: Yarn, Node.js and CMake.
Note: Windows users also need the Build Tools for Visual Studio package.
- Clone the repo using the --recursive arg to fetch the google/sentencepiece submodule.
- Run yarn to fetch the node packages.
- Run yarn build to build google/sentencepiece and the node binding.
- Step outside the directory:
  >> cd ..
- Run node:
  >> node
- Require the node-sentencepiece package:
  (node) var sp = require('./node-sentencepiece')
- Instantiate a processor:
  (node) var proc = new sp.Processor()
- Load a model:
  (node) proc.loadModel('/path/to/model/m.model')
- Use the processor to get tokens:
  (node) proc.encode('Never gonna give you up, Never gonna let you down')
  returns:
  [ '▁', 'N', 'ever', '▁gonna', '▁give', '▁you', '▁up', ',', '▁', 'N', 'ever', '▁gonna', '▁let', '▁you', '▁down' ]
- You can get back the original input text from the tokens by using the decode method:
  (node) var modelPath = '/path/to/model/m.model'
  (node) var inputText = 'Feel the rain on your skin No one else can feel it for you'
  (node) var proc = new sp.Processor()
  (node) proc.loadModel(modelPath)
  (node) var pieces = proc.encode(inputText)
  (node) var outputText = proc.decode(pieces, modelPath)
  (node) inputText === outputText
  returns:
  true
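
The '▁' character (U+2581) in the encoded pieces is SentencePiece's marker for a space, which is why decoding can recover the input exactly. As a rough illustration of that round trip, the sketch below rebuilds the text from the pieces by hand. The helper piecesToText is hypothetical and not part of node-sentencepiece, and it assumes the model uses the default '▁' marker with a dummy leading space; in practice, use proc.decode.

```javascript
// Hypothetical helper for illustration only; it is NOT part of node-sentencepiece.
// Assumes the default SentencePiece convention: '\u2581' ('▁') stands for a space,
// and a dummy space is prepended to the input before tokenization.
function piecesToText(pieces) {
  return pieces
    .join('')                 // concatenate the pieces in order
    .replace(/\u2581/g, ' ')  // turn every '▁' marker back into a space
    .replace(/^ /, '');       // drop the dummy leading space
}

// Pieces copied from the encode example above.
const pieces = [
  '▁', 'N', 'ever', '▁gonna', '▁give', '▁you', '▁up', ',',
  '▁', 'N', 'ever', '▁gonna', '▁let', '▁you', '▁down',
];

console.log(piecesToText(pieces));
// prints "Never gonna give you up, Never gonna let you down"
```

This covers only the plain-text case; models with custom normalization rules or special tokens may not round-trip this way.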