Info: This post is auto-generated from the Hacker News RSS feed. Source: Tokenization for language modeling: BPE vs. Unigram Language Modeling (2020)