Korean, Edit

LLM Useful Functions Collection

Recommended posts: 【Python】 Python Useful Functions Collection, 【Algorithms】 Lecture 21. NLP and LLM


1. NLP useful functions

2. LLM applications

3. LLM agents

4. AI Scientist



1. NLP useful functions

⑴ A function that automatically translates a given sentence into English


! pip install --upgrade googletrans httpx httpcore deep_translator

def to_english (sentence):
    from deep_translator import GoogleTranslator
    translated = GoogleTranslator(source='auto', target='en').translate(sentence)
    return translated

print( to_english("나는 소년입니다.") )
# I am a boy.

print( to_english("단핵구") )
# monocytes


⑵ A function that automatically translates a given sentence into Korean


def to_korean (sentence):
    from deep_translator import GoogleTranslator
    translated = GoogleTranslator(source='auto', target='ko').translate(sentence)
    return translated

print( to_korean("I am a boy.") )
# 저는 남자입니다.


⑶ A function that automatically translates a given sentence into Japanese


def to_japanese (sentence):
    from deep_translator import GoogleTranslator
    translated = GoogleTranslator(source='auto', target='ja').translate(sentence)
    return translated

print( to_japanese("I am a boy.") )
# 私は男の子です。


Translation of a whole Markdown document

⑸ A function that turns an arbitrary variable-length natural language sentence into a 384-dimensional vector, considering its meaning (cf. CELLama)


from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
import numpy as np
from scipy.sparse import csr_matrix
import pandas as pd
from sklearn.neighbors import NearestNeighbors
import torch
from torch.utils.data import DataLoader, TensorDataset
from xgboost import XGBClassifier

def sentences_to_embedding(sentences):
    embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
    db = embedding_function.embed_documents(sentences)
    emb_res = np.asarray(db)
    return emb_res


sentences = []
sentences.append("What is the meaning of: obsolete")
sentences.append("What is the meaning of: old-fashioned")
sentences.append("What is the meaning of: demagogue")
emb_res = sentences_to_embedding(sentences)



2. LLM applications

ollama.ai (free): Llama2/3/4, Phi-3, Mistral, Gemma, etc

GroqChat (free): Mixtral, Llama3, Gemma

OpenRouter (charged): Can utilize ChatGPT API, etc.

⑷ English–Korean translation using Llama2 (free)


import ollama

def english_to_korean(sentence):
    content = 'Translate "' + sentence + '" to Korean. Output only the translated sentence.'
    response = ollama.chat(model='llama2', messages=[
      {
        'role': 'user',
        'content': content,
      },
    ])
    return response['message']['content']
    
sentence = "I am a boy."
english_to_korean(sentence)


⑸ Determine whether a given sentence contains a chemical formula


import ollama

def is_chemical_formula(sentence):
    content = 'Please determine if "' + sentence + '" contains a chemical formula or not. If it is correct, answer "sure"; otherwise, "no".'
    response = ollama.chat(model='llama2', messages=[
      {
        'role': 'user',
        'content': content,
      },
    ])
    return response['message']['content']
    
sentence = "NH2OH is an amine."
result = is_chemical_formula(sentence)
print(result)
print('sure' in result.lower())

sentence = "I am a boy."
result = is_chemical_formula(sentence)
print(result)
print('sure' in result.lower())

### Output ###
'''
The term "NH2OH" does contain a chemical formula, so the answer is "yes" or "sure".
True
The statement "I am a boy" does not contain any chemical formulas, so the answer is "no".
False
'''


⑹ Determine whether a given noun is a proper noun or a common noun


import ollama

def is_proper_noun(noun):
    content = 'Please determine if "' + noun + '" is a proper noun or common noun. If it is a proper noun, answer "proper"; otherwise, "common".'
    response = ollama.chat(model='llama2', messages=[
      {
        'role': 'user',
        'content': content,
      },
    ])
    return response['message']['content']
    
sentence = "Pencil"
result = is_proper_noun(sentence)
print(result)
print('proper' in result.lower())

sentence = "Feynman"
result = is_proper_noun(sentence)
print(result)
print('proper' in result.lower())

### Output ###
'''
"Pencil" is a common noun. Therefore, the answer is "common".
False

"Feynman" is a proper noun. Therefore, the answer is "proper".
True
'''



3. LLM agents

⑴ Overview: 2026 is the year of LLM agents

OpenClaw (clawd.bot): An open-source project made by an independent developer, Peter Steinberger (steipete). Anthropic requested a name change due to trademark issues (cf. Claude Code), so it was rebranded as Moltbot → OpenClaw.

⑶ Claude Code: An LLM-based agentic coding tool made by Anthropic. The specific models are Haiku (fast and inexpensive), Sonnet (balanced), Opus (high performance)

⑷ ChatGPT Pro

Google Gemini and NotebookLM User Guide



4. AI Scientist

The Horizon of Cognition and the Breakthrough Called AI

⑵ FunSearch

AlphaGeometry

Sakana AI

FutureHouse

James Zou

AI-Descartes: Symbolic methods to rediscover Kepler’s third law

Theorizer



Input: 2024.02.10 13:34

Edited: 2026.01.29 00:43

results matching ""

    No results matching ""