3. PROBLEMS:
• Irrelevant quoting and examples.
• Does not distinguish between laws currently in force and obsolete laws.
• Hallucination.
• Fake laws and articles.
• May confuse the laws of different countries when fine-tuning
an existing open-source model.
4. REFERENCE PAPERS:
• Lawyer LLaMA: Enhancing LLMs with Legal Knowledge
• ChatLaw: Open-Source Legal Large Language Model with
Integrated External Knowledge Bases
• Llama 2: Open Foundation and Fine-Tuned Chat Models
• Training language models to follow instructions with human
feedback
5. DATASETS:
• Chinese Legal Corpus
• National Judicial Examination
• JE-Q2EA, JE-QA2E and JE-EXPERT
• Legal Consultation
6.
7.
• SUPERVISED FINE-TUNING
• RETRIEVAL AUGMENTED GENERATION
• Train with human intervention
• Supervised fine-tuning data
• Train with random articles to reduce hallucination and to
stop the model from quoting every article, including
irrelevant ones.
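
The idea of training with random articles can be sketched roughly as follows. This is a minimal illustration, not the method of any of the referenced papers: the function name build_sft_example, the prompt layout, and the placeholder article strings are all assumptions made for the example. It mixes the articles that are actually relevant to a question with a few randomly sampled irrelevant ones, so the target answer only cites what applies.

```python
import random

def build_sft_example(question, answer, relevant, corpus, n_distractors=2, seed=0):
    """Assemble one fine-tuning example (hypothetical format) whose context
    mixes relevant articles with randomly sampled irrelevant ones."""
    rng = random.Random(seed)
    # Distractor candidates: every article not marked as relevant.
    pool = [a for a in corpus if a not in relevant]
    distractors = rng.sample(pool, min(n_distractors, len(pool)))
    context = list(relevant) + distractors
    rng.shuffle(context)  # avoid leaking relevance through position
    prompt = (
        "Articles:\n"
        + "\n".join(f"- {a}" for a in context)
        + f"\n\nQuestion: {question}"
    )
    # The target response cites only the relevant articles.
    return {"prompt": prompt, "response": answer}

# Placeholder corpus; real article text would come from the legal corpus.
corpus = [
    "Article A (placeholder text)",
    "Article B (placeholder text)",
    "Article C (placeholder text)",
    "Article D (placeholder text)",
]
example = build_sft_example(
    question="Placeholder question?",
    answer="Placeholder answer citing Article A only.",
    relevant=["Article A (placeholder text)"],
    corpus=corpus,
)
print(example["prompt"])
```

Because the response never refers to the distractors, examples built this way may encourage the model to ignore irrelevant retrieved articles instead of quoting all of them.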
8.
• ILDC (Indian Legal Documents Corpus)
• Law school examinations
• Focus on a particular segment