Flash

Fast keyword extraction using the Aho–Corasick algorithm and tries.

Flash is a Golang reimplementation of Flashtext.

This is meant to be used when you have a large number of words that you want to:

  • extract from text
  • search and replace

Flash is meant as a replacement for regular expressions, which can be extremely slow when matching against a large keyword list.

Usage

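package main

import (
	"fmt"

	"github.com/dav009/flash"
)

func main() {
	// Register the keywords to look for.
	words := flash.NewKeywords()
	words.Add("New York")
	words.Add("Hello")
	words.Add("Tokyo")
	// Extract returns the registered keywords found in the text.
	foundKeywords := words.Extract("New York and Tokyo are Cities")
	fmt.Println(foundKeywords)
	// [New York, Tokyo]
}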

Benchmarks

As a reference, running go-flash with 10K keywords against a 1000-sentence text took 7.3ms, while using regexes took 1min 37s.

| Sentences | Keywords | strings.Contains | Regex    | Go-Flash |
|-----------|----------|------------------|----------|----------|
| 1000      | 10K      | 1.0035s          | 1min 37s | 2.72ms   |

Warning

This is a toy project for me to get more familiar with Golang. Please be aware of potential issues.
