Farsi Normalizer

FarsiProcessor is a Ruby gem that normalizes and stems Persian/Farsi text.

Normalization is defined as:

Stemming is defined as removing these suffixes (+ suffixes of plural form)

Installation

Add this line to your application's Gemfile:

gem "farsi_processor"

And then execute:

$ bundle

Or install it yourself as:

$ gem install farsi_processor

Usage

require 'farsi_processor'

[1] pry(main)> FarsiProcessor.process("ك")
=> "ک"

[2] pry(main)> FarsiProcessor.process("کتاب‌ ها")
=> "کتاب"

# process accepts only and except options
[3] pry(main)> FarsiProcessor.process("ك ي", only: ["ك"])
=> "ک ي"

[4] pry(main)> FarsiProcessor.process("ك ي", except: ["ك"])
=> "ك ی"

[5] pry(main)> FarsiProcessor.process('دخترهای', except: ['های'])
=> "دختره"

# you can also normalize or stem on their own;
# both methods accept the same only and except options
[6] pry(main)> FarsiProcessor.normalize("ك")
=> "ک"

[7] pry(main)> FarsiProcessor.stem("کتاب‌ ها")
=> "کتاب"
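
To illustrate how filtering rules with only and except might work, here is a minimal standalone sketch. It is not the gem's implementation: the `SketchNormalizer` module and its `RULES` table are hypothetical names, and the table holds only the two character mappings that appear in the examples above.

```ruby
# Hypothetical sketch; NOT the farsi_processor implementation.
module SketchNormalizer
  # Arabic code points mapped to their Persian equivalents.
  # Only the two pairs shown in the README examples are included.
  RULES = {
    "\u0643" => "\u06A9", # Arabic kaf -> Persian keheh
    "\u064A" => "\u06CC"  # Arabic yeh -> Persian yeh
  }.freeze

  # Applies every rule unless narrowed by `only` (whitelist of source
  # characters) or `except` (blacklist of source characters).
  def self.normalize(text, only: nil, except: nil)
    rules = RULES
    rules = rules.select { |from, _| only.include?(from) } if only
    rules = rules.reject { |from, _| except.include?(from) } if except
    rules.reduce(text) { |result, (from, to)| result.gsub(from, to) }
  end
end
```

Calling `SketchNormalizer.normalize("ك ي", only: ["ك"])` applies just the kaf rule and leaves the yeh untouched, mirroring example [3]; passing `except: ["ك"]` does the reverse, mirroring example [4].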

Questions or Problems?

If you run into an issue with farsi_processor that you cannot resolve, please open an issue on GitHub, or fork the project and send a pull request.

License

The gem is available as open source under the terms of the MIT License.
