QT TF-IDF

For the "QT TF-IDF" file, the code:

idf = pd.DataFrame(np.round(tfidf_transformer.idf_, 2),
                   index=count_vectorizer.get_feature_names(), columns=["idf"])



I don't understand what the trailing underscore means in "idf_" in "tfidf_transformer.idf_".

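The idf_ in tfidf_transformer.idf_ holds the inverse document frequency (IDF) of each term, one value per vocabulary entry. IDF is based on how many documents contain a term, not on how often the term appears within a single document: the fewer documents a term occurs in, the higher its IDF. The trailing underscore is a scikit-learn naming convention for attributes that are learned from the data, so they exist only after fit has been called.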
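To make the definition concrete, here is a minimal sketch in plain Python of the unsmoothed formula that scikit-learn documents for smooth_idf=False, idf(t) = ln(n / df(t)) + 1, where n is the number of documents and df(t) is the number of documents containing term t. The toy corpus and the helper name idf are assumptions made up for illustration; they are not part of the course file.

```python
import math

# Hypothetical toy corpus: three already-tokenized documents.
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "bird", "flew"],
]
n_docs = len(docs)


def idf(term):
    # Document frequency: how many documents contain the term at least once.
    df = sum(1 for doc in docs if term in doc)
    # Unsmoothed IDF, as documented for TfidfTransformer(smooth_idf=False).
    return math.log(n_docs / df) + 1


vocabulary = sorted({term for doc in docs for term in doc})
idf_values = {term: round(idf(term), 2) for term in vocabulary}
print(idf_values)
```

A term that appears in every document ("the") gets the minimum value of 1.0, while a term found in only one document ("cat") gets a higher value.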



You can read more here.

But I don't understand why there is an error if I just type "tfidf_transformer.idf" rather than "tfidf_transformer.idf_".

The error occurs because a TfidfTransformer object has no attribute named idf. It has the attribute idf_ instead, which holds the inverse document frequency values.
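The difference between the two names can be checked directly. The sketch below uses a small made-up count matrix (counts is a hypothetical stand-in for the real word counts) and shows that idf_ appears only after fit, while idf is never defined.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer

# Hypothetical count matrix: 3 documents (rows) x 2 terms (columns).
counts = np.array([[1, 0],
                   [1, 1],
                   [2, 0]])

transformer = TfidfTransformer(smooth_idf=False, norm=None)

# Before fit, the learned attribute is not available yet.
has_idf_before_fit = hasattr(transformer, "idf_")

transformer.fit(counts)

# After fit, idf_ holds one learned value per term.
has_idf_after_fit = hasattr(transformer, "idf_")

# The name without the trailing underscore does not exist on the object.
try:
    transformer.idf
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(has_idf_before_fit, has_idf_after_fit)
print(error_message)
```

The trailing underscore marks a value that was estimated from the data during fit, which is why the attribute cannot be read on an unfitted transformer.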



I think you are confusing it with the idf variable, which is just the name given to the DataFrame that stores those values.



Try running the code below, where the fitted model stays in tfidf_transformer and idf is only the name of the DataFrame built from it. That should make the difference clearer.


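# Import NumPy, pandas, and TfidfTransformer from sklearn
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfTransformer


# Instantiate TfidfTransformer with smooth_idf=False and norm=None
tfidf_transformer = TfidfTransformer(smooth_idf=False, norm=None)


# Fit the model (word_count is assumed to be the term-count matrix produced earlier by count_vectorizer)
tfidf_transformer.fit(word_count)


# Print IDF values (idf_ exists only after fit; get_feature_names() was removed in scikit-learn 1.2, so use get_feature_names_out())
idf = pd.DataFrame(np.round(tfidf_transformer.idf_, 2),
                   index=count_vectorizer.get_feature_names_out(), columns=["idf"])


idf.T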