<div>Hi all Xapian-devel,</div><div><br></div><div>Gist: <a href="https://gist.github.com/10d2222d8bffe8d7631d">https://gist.github.com/10d2222d8bffe8d7631d</a></div><div><br></div>I'm using Xapian's TermGenerator to convert Norwegian sentences into a VSM (vector space model). But when I generate the VSM from 'Truet med å stevne misfornøyd PC-kunde - PC-leverandøren Asus likte svært dårlig kundens misfornøyde leserbrev.', the result does not contain the term 'asus'.<div>
<br></div><div>So I tried replacing 'Asus' with other brand names: Acer, Apple, Dell, Fujitsu, HP, Lenovo, LG, NEC, Samsung, Sony and Toshiba. Most of the brand names I tried do appear in the result, except Acer, Apple and Dell. However, for the brands that do appear in the result, the term 'dår' is missing instead.</div>
<div><br></div><div>This problem may be caused by encoding, which I'm investigating now. It would be great if you could help, and if you have any questions about this problem, please reply to me.</div><div>
<br></div><div>Best regards,</div><div>Theerapat</div>
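<div><br></div><div>A minimal sketch for checking the encoding hypothesis, using only the Python standard library. Xapian's TermGenerator treats its input as UTF-8, so one thing worth ruling out is that the sentence reaches it in another encoding such as Latin-1. The helper name <code>is_valid_utf8</code> and the variable names are hypothetical and not taken from the gist; this only inspects the bytes and does not call Xapian itself.</div>

```python
# Hypothetical check: is the test sentence valid UTF-8 before it is indexed?
SENTENCE = (
    "Truet med å stevne misfornøyd PC-kunde - PC-leverandøren Asus "
    "likte svært dårlig kundens misfornøyde leserbrev."
)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same sentence as it would look in two different encodings.
utf8_bytes = SENTENCE.encode("utf-8")
latin1_bytes = SENTENCE.encode("latin-1")

print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(latin1_bytes))  # False

# The non-ASCII letter is where the two encodings differ.
print("dårlig".encode("utf-8"))     # b'd\xc3\xa5rlig'
print("dårlig".encode("latin-1"))   # b'd\xe5rlig'
```

<div>If the bytes actually passed to the indexer fail this check, the encoding is a plausible cause; if they pass, the missing terms more likely come from somewhere else, for example stemming or stop-word handling.</div>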