Unfortunately, I haven't yet compared the performance of these algorithms on benchmarks. Some comparison data is available in Kohlschütter's paper: https://www.l3s.de/~kohlschuetter/publications/wsdm187-kohlschuetter.pdf (it compares individual features rather than all the different algorithms). Another interesting comparison is here: https://web.archive.org/web/20120606173919/http://tomazkovacic.com/blog/122/evaluating-text-extraction-algorithms