compromise (version 14.14.3)

modest natural language processing

npm install compromise

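by Spencer Kelly and many contributors

french • german • italian • spanish

don't you find it strange,
how easy text is to make,
and how hard it is to actually parse and use?

compromise tries its best to turn text into data.

it makes limited and sensible decisions.
it's not as smart as you'd think.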

import nlp from 'compromise'

let doc = nlp('she sells seashells by the seashore.')
doc.verbs().toPastTense()
doc.text()
// 'she sold seashells by the seashore.'

don't be fancy, at all:

if (doc.has('simon says #Verb')) {
  return true
}

grab parts of the text:

let doc = nlp(entireNovel)
doc.match('the #Adjective of times').text()
// "the blurst of times?"

match docs

and get data:

import plg from 'compromise-speech'
nlp.extend(plg)

let doc = nlp('Milwaukee has certainly had its share of visitors..')
doc.compute('syllables')
doc.places().json()
/*
[{
  "text": "Milwaukee",
  "terms": [{
    "normal": "milwaukee",
    "syllables": ["mil", "wau", "kee"]
  }]
}]
*/

json docs

avoid the problems of brittle parsers:

let doc = nlp("we're not gonna take it..")

doc.has('gonna') // true
doc.has('going to') // true (implicit)

// transform
doc.contractions().expand()
doc.text()
// 'we are not going to take it..'

contraction docs

and whip stuff around like it's data:

let doc = nlp('ninety five thousand and fifty two')
doc.numbers().add(20)
doc.text()
// 'ninety five thousand and seventy two'

number docs

- because it actually is -

let doc = nlp('the purple dinosaur')
doc.nouns().toPlural()
doc.text()
// 'the purple dinosaurs'

noun docs

use it on the client-side:

<script src="https://unpkg.com/compromise"></script>
<script>
  var doc = nlp('two bottles of beer')
  doc.numbers().minus(1)
  document.body.innerHTML = doc.text()
  // 'one bottle of beer'
</script>

or likewise:

import nlp from 'compromise'

var doc = nlp('London is calling')
doc.verbs().toNegative()
// 'London is not calling'

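compromise is ~250kb (minified).

it's pretty fast. It can run on keypress.

it works mainly by conjugating all forms of a basic word list.

The final lexicon is ~14,000 words.

you can read more about how it works, here. it's weird.

okay -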

`compromise/one`

A tokenizer of words, sentences, and punctuation.

import nlp from 'compromise/one'

let doc = nlp("Wayne's World, party time")
let data = doc.json()
/* [{
  normal:"wayne's world party time",
    terms:[{ text: "Wayne's", normal: "wayne" },
      ...
      ]
  }]
*/

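tokenizer docs

compromise/one splits your text up, wraps it in a handy API,

and does nothing else -

/one is quick - most sentences take a tenth of a millisecond.

It can do ~1MB of text a second - or 10 Wikipedia pages.

Infinite Jest takes 3s.

You can also parallelize, or stream text to it with compromise-speed.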

`compromise/two`

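A part-of-speech tagger.

import nlp from 'compromise/two'

let doc = nlp("Wayne's World, party time")
let str = doc.match('#Possessive #Noun').text()
// "Wayne's World"

tagger docs

compromise/two automatically calculates the very basic grammar of each word.

this is more useful than people sometimes realize.

Light grammar helps you write cleaner templates, and get closer to the information.

compromise has 83 tags, arranged in a handsome graph.

#FirstName → #Person → #ProperNoun → #Noun

you can see the grammar of each word by running doc.debug()

you can see the reasoning for each tag with nlp.verbose('tagger').

if you prefer Penn tags, you can derive them with:

let doc = nlp('welcome thrillho')
doc.compute('penn')
doc.json()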

`compromise/three`

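Phrase and sentence tooling.

import nlp from 'compromise/three'

let doc = nlp("Wayne's World, party time")
let str = doc.people().normalize().text()
// "wayne"

selection docs

compromise/three is a set of tooling to zoom into and operate on parts of a text.

.numbers(), for example, grabs the numbers in a document and adds new methods to them, like .subtract().

When you have a phrase, or group of words, you can see additional metadata about it with .json()

let doc = nlp('four out of five dentists')
console.log(doc.fractions().json())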
/*[{
    text: 'four out of five',
    terms: [ [Object], [Object], [Object], [Object] ],
    fraction: { numerator: 4, denominator: 5, decimal: 0.8 }
  }
]*/

let doc = nlp('$4.09CAD')
doc.money().json()
/*[{
    text: '$4.09CAD',
    terms: [ [Object] ],
    number: { prefix: '$', num: 4.09, suffix: 'cad'}
  }
]*/

API

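compromise/one

Output

.text() - return the document as text
.json() - return the document as data
.debug() - pretty-print the interpreted document
.out() - a named or custom output
.html({}) - output custom html tags for matches
.wrap({}) - produce custom output for document matches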

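Utils

.found [getter] - is this document empty?
.docs [getter] - get term objects as json
.isView [getter] - identify a compromise object
.compute() - run a named analysis on the document
.clone() - deep-copy the document, so that no references remain
.termList() - return a flat list of all Term objects in the match
.uncache() - un-freeze the current state of the document, so it may be transformed
.freeze({}) - prevent any tags from being removed, in these terms
.unfreeze({}) - allow tags to change again, as default

Accessors

.all() - return the whole original document ('zoom out')
.terms() - split up results by each individual term
.first(n) - use only the first result(s)
.last(n) - use only the last result(s)
.slice(n,n) - grab a subset of the results
.eq(n) - use only the nth result
.firstTerms() - get the first word in each match
.lastTerms() - get the end word in each match
.fullSentences() - get the whole sentence for each match
.groups() - grab any named capture-groups from a match
.wordCount() - count the number of terms in the document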

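Match

(match methods use the match-syntax.)

.match('') - return a new Doc, with this one as a parent
.not('') - return all results except for this
.matchOne('') - return only the first match
.if('') - return each current phrase, only if it contains this match ('only')
.ifNo('') - filter out any current phrases that have this match ('notIf')
.has('') - return a boolean if this match exists
.after('') - return all terms after a match
.union() - return combined matches without duplicates
.intersection() - return only duplicate matches
.complement() - get everything not in another match
.settle() - remove overlaps from matches
.growRight('') - add any matching terms immediately after each match
.growLeft('') - add any matching terms immediately before each match
.grow('') - add any matching terms before or after each match
.join() - merge any neighbouring terms in each match
.lookup([]) - quick find for an array of string matches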

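Case

.toLowerCase() - turn every letter of every term to lower-case
.toUpperCase() - turn every letter of every term to upper-case
.toTitleCase() - upper-case the first letter of each term
.toCamelCase() - remove whitespace and title-case each term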

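Whitespace

.pre('') - add this punctuation or whitespace before each match
.post('') - add this punctuation or whitespace after each match
.trim() - remove start and end whitespace
.hyphenate() - connect words with a hyphen, and remove whitespace
.dehyphenate() - remove hyphens between words, and set whitespace
.toParentheses() - add brackets around these matches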

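Loops

.map(fn) - run each phrase through a function, and create a new document
.filter(fn) - return only the phrases that return true
.find(fn) - return a document with only the first phrase that matches
.some(fn) - return true or false if there is one matching phrase
.random(fn) - sample a subset of the results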

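Insert

.replace(match, replace) - search and replace match with new content
.replaceWith(replace) - substitute-in new text
.remove() - fully remove these terms from the document
.insertAfter(str) - add these new terms to the end of each match (append)
.concat() - add these new things to the end
.swap(fromLemma, toLemma) - smart replace of root-words, using proper conjugation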

Transform

.sort('method') - re-arrange the order of the matches (in place)
.normalize({}) - clean up the text in various ways
.unique() - remove any duplicate matches
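Lib

(these methods are on the main nlp object)

nlp.tokenize(str) - parse text without running POS-tagging
nlp.lazy(str, match) - scan through a text with minimal analysis
nlp.plugin({}) - mix in a compromise-plugin
nlp.parseMatch(str) - pre-parse any match statements into json
nlp.world() - grab or change library internals
nlp.model() - grab all current linguistic data
nlp.methods() - grab or change internal methods
nlp.hooks() - see which compute methods run automatically
nlp.verbose(mode) - log our decision-making for debugging
nlp.version - current semver version of the library
nlp.addWords(obj, isFrozen?) - add new words to the lexicon
nlp.addTags(obj) - add new tags to the tagSet
nlp.typeahead(arr) - add words to the auto-fill dictionary
nlp.buildTrie(arr) - compile a list of words into a fast lookup form
nlp.buildNet(arr) - compile a list of matches into a fast match form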
compromise/two:

Contractions

.contractions() - things like "didn't"
.contract() - contract things like "did not"

compromise/three:

Nouns

.nouns() - return any subsequent terms tagged as a Noun
- .nouns().json() - overloaded output with noun metadata
- .nouns().parse() - get tokenized noun-phrase
- .nouns().isPlural() - return only plural nouns
- .nouns().isSingular() - return only singular nouns
- .nouns().toPlural() - 'football captain' → 'football captains'
- .nouns().toSingular() - 'turnovers' → 'turnover'
- .nouns().adjectives() - get any adjectives describing this noun

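Verbs

.verbs() - return any subsequent terms tagged as a Verb
- .verbs().json() - overloaded output with verb metadata
- .verbs().parse() - get tokenized verb-phrase
- .verbs().subjects() - what is doing the verb action
- .verbs().adverbs() - return the adverbs describing this verb
- .verbs().isSingular() - return singular verbs like 'spencer walks'
- .verbs().isPlural() - return plural verbs like 'we walk'
- .verbs().isImperative() - only instruction verbs like 'eat it!'
- .verbs().toPastTense() - 'will go' → 'went'
- .verbs().toPresentTense() - 'walked' → 'walks'
- .verbs().toFutureTense() - 'walked' → 'will walk'
- .verbs().toInfinitive() - 'walks' → 'walk'
- .verbs().toGerund() - 'walks' → 'walking'
- .verbs().toPastParticiple() - 'drive' → 'had driven'
- .verbs().conjugate() - return all conjugations of these verbs
- .verbs().toNegative() - 'went' → 'did not go'
- .verbs().toPositive() - "didn't study" → 'studied'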

Numbers

.numbers() - grab all written and numeric values
- .numbers().parse() - get tokenized number phrase
- .numbers().get() - get a simple javascript number
- .numbers().json() - overloaded output with number metadata
- .numbers().toNumber() - convert 'five' to 5
- .numbers().toText() - convert '5' to five
- .numbers().toOrdinal() - convert 'five' to fifth or 5th
- .numbers().toCardinal() - convert 'fifth' to five or 5
- .numbers().isOrdinal() - return only ordinal numbers
- .numbers().isCardinal() - return only cardinal numbers
- .numbers().set(n) - set number to n
- .numbers().add(n) - increase number by n
- .numbers().increment() - increase number by 1
- .numbers().decrement() - decrease number by 1
.money() - things like '$2.50'
- .money().get() - retrieve the parsed amount(s) of money
- .money().json() - currency + number info
- .money().currency() - which currency the money is in
.fractions() - like '2/3rds' or 'one out of five'
- .fractions().parse() - get tokenized fraction
- .fractions().get() - simple numerator, denominator data
- .fractions().json() - json method overloaded with fraction data
- .fractions().toDecimal() - '2/3' -> '0.66'

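Sentences

.sentences() - return a sentence class with additional methods
- .sentences().isQuestion() - return questions with a ?
- .sentences().isExclamation() - return sentences with a !
- .sentences().isStatement() - return sentences without ? or !

Adjectives

.adjectives() - things like 'quick'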

Misc selections

.clauses() - split up sentences into multi-term phrases
.hyphenated() - all terms connected with a hyphen or dash, like 'wash-out'
.phoneNumbers() - things like '(939) 555-0113'
.hashTags() - things like '#nlp'
.emails() - email addresses
.emoticons() - things like :)
.emojis() - emoji characters
.atMentions() - things like '@nlp_compromise'
.urls() - things like 'compromise.cool'
.pronouns() - things like 'he'
.conjunctions() - things like 'but'
.prepositions() - things like 'of'
.abbreviations() - things like 'Mrs.'
.people() - names like 'John F. Kennedy'
- .people().json() - get person-name metadata
- .people().parse() - get person-name interpretation
.places() - like 'Paris, France'
.organizations() - like 'Google, Inc'
.adverbs() - things like 'quickly'
- .adverbs().json() - get adverb metadata
.acronyms() - things like 'FBI'
- .acronyms().addPeriods() - add periods to acronyms
.parentheses() - return anything inside (parentheses)
- .parentheses().strip() - remove brackets
.possessives() - things like "Spencer's"
.quotations() - return any terms inside paired quotation marks
- .quotations().strip() - remove quotation marks
.slashes() - return any terms grouped by slashes

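.extend():

This library comes with a considerate, common-sense baseline for English grammar.

You're free to change, or lay waste to, any settings - which is the fun part actually.

The easiest part is just to suggest tags for any given words:

let myWords = {
  kermit: 'FirstName',
  fozzie: 'FirstName',
}
let doc = nlp(muppetText, myWords)

or make heavier changes with a compromise-plugin.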

import nlp from 'compromise'
nlp.extend({
  // add new tags
  tags: {
    Character: {
      isA: 'Person',
      notA: 'Adjective',
    },
  },
  // add or change words in the lexicon
  words: {
    kermit: 'Character',
    gonzo: 'Character',
  },
  // change inflections
  irregulars: {
    get: {
      pastTense: 'gotten',
      gerund: 'gettin',
    },
  },
  // add new methods to compromise
  api: View => {
    View.prototype.kermitVoice = function () {
      this.sentences().prepend('well,')
      this.match('i [(am|was)]').prepend('um,')
      return this
    }
  },
})

.plugin() docs

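Docs:

gentle introduction:

#1) Input → output
#2) Match & transform
#3) Making a chat-bot

Documentation:

Concepts	API	Plugins
Accuracy	Accessors	Adjectives
Caching	Constructor-methods	Dates
Case	Contractions	Export
Filesize	Insert	Hash
Internals	Json	Html
Justification	Character Offsets	Keypress
Lexicon	Loops	Ngrams
Match-syntax	Match	Numbers
Performance	Nouns	Paragraphs
Plugins	Output	Scan
Projects	Selections	Sentences
Tagger	Sorting	Syllables
Tags	Split	Pronounce
Tokenization	Text	Strict
Named-Entities	Utils	Penn-tags
Whitespace	Verbs	Typeahead
World data	Normalization	Sweep
Fuzzy-matching	TypeScript	Mutation
		Root-forms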

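Talks:

Language as an Interface - by Spencer Kelly
Coding Chatbots - by Kahwee Teng
On Typing and data - by Spencer Kelly

Articles:

Geocoding Social Conversations with NLP and JavaScript - by Microsoft
Microservice Recipe - by Eventn
Adventure Game Sentence Parsing with Compromise
Building Text-Based Games - by Matt Eland
Fun with JavaScript in BigQuery - by Felipe Hoffa
Natural Language Processing... in the Browser? - by Charles Landau

Some fun applications:

Automated Bechdel Test - by The Guardian
Story generation framework - by Jose Phrocca
Tumbler blog of lists - horse-ebooks-like lists - by Michael Paulukonis
Video Editing from Transcription - by New Theory
Browser extension Fact-checking - by Alexander Kidd
Siri shortcut - by Michael Byrns
Amazon skill - by Tajddin Maghni
Tasking Slack-bot - by Kevin Suh [see more]

Comparisons

Compromise and Spacy
Compromise and NLTK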

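Plugins:

These are some helpful extensions:

Dates

npm install compromise-dates

.dates() - find dates like June 8th or 03/03/18
- .dates().get() - simple start/end json result
- .dates().json() - overloaded output with date metadata
- .dates().format('') - convert the dates to specific formats
- .dates().toLongForm() - convert 'Feb' to 'February'
.durations() - 2 weeks or 5mins
- .durations().get() - return simple json for duration
- .durations().json() - overloaded output with duration metadata
.times() - 4:30pm or half past five
- .times().get() - return simple json for times
- .times().json() - overloaded output with time metadata

Stats

npm install compromise-stats

.tfidf({}) - rank words by frequency and uniqueness
.ngrams({}) - list all repeating sub-phrases
.unigrams() - n-grams with one word
.bigrams() - n-grams with two words
.trigrams() - n-grams with three words
.startgrams() - n-grams including the first term of a phrase
.endgrams() - n-grams including the last term of a phrase

Speech

npm install compromise-speech

.soundsLike() - produce an estimated pronunciation

Wikipedia

npm install compromise-wikipedia

.wikipedia() - compressed article reconciliation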

TypeScript

we're committed to TypeScript/Deno support, both in main and in the official plugins:

import nlp from 'compromise'
import stats from 'compromise-stats'

const nlpEx = nlp.extend(stats)

nlpEx('This is type safe!').ngrams({ min: 1 })

typescript docs

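Limitations:

Slash-support: we currently split slashes up as different words, as we do for hyphens. So things like this don't work: nlp('the koala eats/shoots/leaves').has('koala leaves') //false
Inter-sentence match: by default, sentences are the top-level abstraction. Inter-sentence, or multi-sentence matches aren't supported without a plugin: nlp("that's it. Back to Winnipeg!").has('it back') //false
Nested match syntax: the danger of regex is that you can recurse indefinitely. Our match syntax is much weaker. Things like this are not (yet) possible: doc.match('(modern (major|minor))? general'). Complex matches must be achieved with successive .match() statements.
Dependency parsing: proper sentence transformation requires understanding the syntax tree of a sentence, which we don't currently do. We should! Help wanted with this.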

FAQ

☂️ Isn't JavaScript too...

Can it run on my arduino-watch?

Compromise in other languages?

Partial builds?


See also:

en-pos - by Alex Corvi
naturalNode - statistical NLP in JavaScript
winkJS - POS-tagger, tokenizer, machine-learning in JavaScript
dariusk/pos-js - fastTag fork in JavaScript
compendium-js - POS and sentiment analysis in JavaScript
nodeBox linguistics - conjugation, inflection in JavaScript
reText - very impressive text utilities in JavaScript
superScript - conversation engine in JS
jsPos - JavaScript build of the time-tested Brill-tagger
spaCy - speedy, multilingual tagger in C/Python
Prose - quick tagger by Joseph Kato
TextBlob - Python tagger

MIT

Additional information

Version: 14.14.3
Type: other source code
Updated: 2025-04-16
Size: 3.41MB
Source: GitHub
