- Analysis, Models & Methods -
Transfer Learning | FastText Embeddings
One powerful aspect of language models is their ability to learn under self-supervised conditions, meaning labeled data isn't needed for training. Instead, these models use attention to learn relationships and context within the text. While this is an impressive feat, learning word embeddings from scratch requires a significant volume of text data and is computationally expensive. It is therefore common to use existing embeddings that have already been trained on large corpora. Transfer learning allows the LSTM to focus on the task of text classification (identifying sentiment in reviews) rather than starting at ground zero with language modeling, and leveraging pre-trained vectors typically improves performance on the downstream task. There are many well-known pre-trained embeddings, such as GloVe, Word2Vec, or, in this case, FastText.
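Below is a minimal sketch of this transfer-learning setup: pre-trained FastText vectors are read from disk, mapped into an embedding matrix aligned with the model's vocabulary, and used to initialize a frozen embedding layer in front of the LSTM. The file name, toy vocabulary, and layer sizes are illustrative assumptions rather than the project's exact configuration.

```python
# Sketch: initialize an LSTM sentiment classifier with pre-trained FastText vectors.
import numpy as np
import tensorflow as tf

EMBED_DIM = 300      # FastText wiki-news vectors are 300-dimensional
MAX_VOCAB = 20_000   # assumed vocabulary cap

# 1. Read the pre-trained .vec file (one word followed by its vector per line).
#    "wiki-news-300d-1M.vec" is an assumed file name for the FastText vectors.
fasttext = {}
with open("wiki-news-300d-1M.vec", encoding="utf-8") as f:
    next(f)  # skip the header line (vocab size, dimension)
    for line in f:
        parts = line.rstrip().split(" ")
        fasttext[parts[0]] = np.asarray(parts[1:], dtype="float32")

# 2. Build an embedding matrix aligned with the tokenizer's word index.
#    A toy word index stands in for the real fitted tokenizer vocabulary.
word_index = {"great": 1, "terrible": 2, "product": 3, "quality": 4}
embedding_matrix = np.zeros((MAX_VOCAB, EMBED_DIM))
for word, idx in word_index.items():
    if idx < MAX_VOCAB and word in fasttext:
        embedding_matrix[idx] = fasttext[word]

# 3. Use the pre-trained vectors to initialize a frozen embedding layer, so the
#    LSTM trains only on the classification task (transfer learning).
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        MAX_VOCAB, EMBED_DIM,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # positive vs. negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```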
Like its predecessors, GPT-3.5 Turbo is trained on language modeling (i.e., predicting the next word in a sequence); in addition, this model version supports the following in-context learning approaches (see the sketch after this list):
Zero-Shot Learning - context prompt/instructions given with the input, no examples
Single-Shot Learning - context prompt plus one example input/output pair
Few-Shot Learning - context prompt plus multiple input/output examples (typically 10-100)
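The sketch below illustrates how these three prompting styles differ when expressed in the chat-completions message format used by GPT-3.5 Turbo; the instruction text and example pairs are illustrative assumptions, not the project's actual prompts.

```python
# Sketch: zero-, single-, and few-shot prompt structures for a chat model.
instruction = "Classify the sentiment of the review as Positive or Negative."

# Zero-shot: instructions only, no worked examples.
zero_shot = [
    {"role": "system", "content": instruction},
    {"role": "user", "content": "Review: The battery died after two days."},
]

# Single-shot: instructions plus one worked input/output pair.
single_shot = [
    {"role": "system", "content": instruction},
    {"role": "user", "content": "Review: Arrived quickly and works perfectly."},
    {"role": "assistant", "content": "Positive"},
    {"role": "user", "content": "Review: The battery died after two days."},
]

# Few-shot: instructions plus several (typically 10-100) input/output pairs.
few_shot = zero_shot[:1] + [
    {"role": "user", "content": "Review: Great value for the price."},
    {"role": "assistant", "content": "Positive"},
    {"role": "user", "content": "Review: Stopped working within a week."},
    {"role": "assistant", "content": "Negative"},
    # ...more example pairs...
    {"role": "user", "content": "Review: The battery died after two days."},
]
```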
Positive & Negative Review Summarization:
Goal: consolidate many reviews into two digestible summaries outlining the main pros and cons identified across all of the reviews.
A zero-shot prompt structure was used to generate the summaries; zero-shot was chosen to reduce token usage given context-length limitations and API cost (see the sketch below).
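The sketch below shows what such a zero-shot summarization call could look like with the openai v1 Python client; the prompt wording, sample reviews, and parameters are illustrative assumptions rather than the exact configuration used.

```python
# Sketch: zero-shot pros/cons summarization of a product's reviews with GPT-3.5 Turbo.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

reviews = [
    "Fits perfectly and the fabric feels premium.",
    "Color faded after one wash, very disappointing.",
    # ...remaining reviews for this product...
]

prompt = (
    "Below are customer reviews for a single product. "
    "Write two short summaries: one listing the main pros mentioned "
    "across the reviews, and one listing the main cons.\n\n"
    + "\n".join(f"- {r}" for r in reviews)
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],  # zero-shot: no examples included
    temperature=0.3,  # keep the summaries focused and consistent
)
print(response.choices[0].message.content)
```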