TensorFlow examples (text-based)

This page lists text-based examples using TensorFlow.

This is a good post: it explains how to train the model using your own dataset.

To create a useful model, you should train it on a large dataset. Ideally, the dataset should be specific to your task: summarizing news articles may be different from summarizing legal documents or job descriptions.

A full example can be found in the TensorFlow examples (DNN-based text classification with DBpedia data): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/text_classification.py (note that the code there will be updated as new APIs arrive, so it is better to check the repository for the current version).

Another text classification example, this one using a CNN (CNN-based text classification with DBpedia data):

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/text_classification_cnn.py

It contains sample code for feeding a custom training dataset from CSV files. It uses a simple logistic regression classifier to classify e-mails.
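As a rough illustration of that idea (reading training data from CSV and fitting a logistic regression classifier), here is a minimal sketch. It does not reproduce the linked code: it uses only the Python standard library and a hand-written gradient descent loop rather than TensorFlow. The column layout, the feature names, and the sample rows are all hypothetical.

```python
import csv
import io
import math

# Hypothetical CSV layout: a label column (1 = spam, 0 = not spam) followed by
# numeric features. Real e-mail data would need a text-to-feature step first.
SAMPLE_CSV = """label,num_links,num_exclamations
1,5,7
1,4,6
1,6,8
0,0,1
0,1,0
0,0,0
"""

def load_csv(text):
    """Parse CSV text into feature rows and integer labels."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    features, labels = [], []
    for row in reader:
        labels.append(int(row[0]))
        features.append([float(value) for value in row[1:]])
    return features, labels

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(features, labels, learning_rate=0.1, epochs=500):
    """Fit weights and a bias with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            prediction = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = prediction - y  # gradient of the log loss w.r.t. the logit
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias

def predict(weights, bias, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5)

features, labels = load_csv(SAMPLE_CSV)
weights, bias = train(features, labels)
predictions = [predict(weights, bias, x) for x in features]
```

To read from a file instead of the inline string, the same reader loop can be applied to an opened file object. In a TensorFlow version, the training loop would typically be replaced by the framework's own optimizer.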

Its code is on GitHub: Convolutional Neural Network for Text Classification in TensorFlow (Python 3) by dennybritz. A Python 2 version by atveit, forked from dennybritz's Python 3 version, is also on GitHub.

Note that the Python 3 version has more functionality (e.g., eval.py) and is more up to date.

tf.device("/cpu:0") forces an operation to be executed on the CPU. By default, TensorFlow will try to place the operation on the GPU if one is available, but at the time of writing the embedding implementation did not have GPU support and threw an error if placed on the GPU.
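The device pinning described above might look like the following sketch. It assumes TensorFlow 2.x running in eager mode. The vocabulary size, embedding dimension, word IDs, and variable names are illustrative and are not taken from the linked repository.

```python
import tensorflow as tf

vocab_size = 1000     # illustrative value, not from the linked code
embedding_dim = 16    # illustrative value

# Everything created inside this block is placed on the CPU,
# even if a GPU is available.
with tf.device("/cpu:0"):
    embedding_matrix = tf.Variable(
        tf.random.uniform([vocab_size, embedding_dim], -1.0, 1.0),
        name="embedding_matrix",
    )
    # A batch of two "sentences", each holding three word IDs.
    word_ids = tf.constant([[4, 8, 15], [16, 23, 42]])
    # Look up one embedding vector per word ID; result shape is [2, 3, 16].
    embedded_words = tf.nn.embedding_lookup(embedding_matrix, word_ids)
```

Operations created outside the `with` block fall back to TensorFlow's default placement, so only the embedding step is pinned to the CPU.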

Johnson, R., & Zhang, T. (2014). Effective use of word order for text categorization with convolutional neural networks. arXiv preprint arXiv:1412.1058.