Published 12-04-2024
Keywords
- BERT transformer
- Bidirectional Encoder Representations from Transformers
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Abstract
Transfer learning is the new craze in natural language processing, not only for setting task-specific performance records across many tasks through large-scale computation, huge parameter counts, and extensive training, but also for the degree of model reusability it offers. The rise of the transformer, beginning with the "Attention Is All You Need" paper by Vaswani et al. and followed by the introduction of BERT (Bidirectional Encoder Representations from Transformers) by the Google research team, has been spectacular. BERT is designed to pre-train deep bidirectional representations from unlabelled text by jointly conditioning on left and right context in all layers, and it has paved the way for many transformer-based models since November 2018.
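To make the bidirectional pre-training objective concrete, the sketch below exercises BERT's masked language modelling: a token is hidden and the model predicts it from its left and right context simultaneously. This is a minimal illustration assuming the Hugging Face `transformers` and `torch` libraries are available; the example sentence and the `bert-base-uncased` checkpoint are arbitrary choices for demonstration, not details from the original text.

```python
# Minimal sketch of BERT's masked language modelling, the pre-training
# objective described above: the model predicts a hidden token using
# both left and right context at once.
# Assumes: pip install torch transformers
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# "[MASK]" stands in for the token BERT must reconstruct.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take the highest-scoring vocabulary entry.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))  # a plausible completion, e.g. "paris"
```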
Building on the concept of transfer learning, much larger models have been pre-trained on vast numbers of training examples, given more data and training time. These models are then fine-tuned on much smaller datasets for different linguistic tasks and applications, and this modern trend has yielded markedly better performance on the fine-tuning tasks. In this chapter, we look at various training paradigms of transfer learning, explore several fine-tuning methods, and list state-of-the-art results on different linguistic tasks. Finally, we discuss deployment for real-world or commercial natural language processing tasks.
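As a hedged illustration of the pre-train-then-fine-tune paradigm just described, the sketch below fine-tunes a pre-trained BERT checkpoint on a tiny, made-up sentiment dataset using the Hugging Face `transformers` library; the texts, labels, and hyperparameters are placeholders standing in for whatever task-specific data a real application provides.

```python
# Sketch of fine-tuning: a pre-trained BERT encoder plus a freshly
# initialized classification head is trained on a small task-specific dataset.
# Assumes: pip install torch transformers
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # binary task, e.g. sentiment
)

# Placeholder fine-tuning data; a real task would use a proper dataset.
texts = ["a delightful read", "a tedious slog"]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few passes over the toy batch
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()   # cross-entropy on the classification head
    optimizer.step()
    optimizer.zero_grad()

# After fine-tuning, predict on new text.
model.eval()
with torch.no_grad():
    probe = tokenizer("an absolute joy", return_tensors="pt")
    print(model(**probe).logits.argmax(dim=-1))  # e.g. tensor([1])
```

The design point this illustrates is that only a small labelled dataset and a short training run are needed once the expensive pre-training has been done, which is exactly the reusability argument made above.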