Korean, Edit

LLM Useful Functions Collection

Recommended posts: 【Python】 Python Useful Functions Collection, 【Algorithms】 Lecture 21. NLP and LLM


1. NLP useful functions

2. LLM applications

3. LLM agents

4. AI Scientist



1. NLP useful functions

⑴ A function that automatically translates a given sentence into English


! pip install --upgrade googletrans httpx httpcore deep_translator

def to_english (sentence):
    from deep_translator import GoogleTranslator
    translated = GoogleTranslator(source='auto', target='en').translate(sentence)
    return translated

print( to_english("나는 소년입니다.") )
# I am a boy.

print( to_english("단핵구") )
# monocytes


⑵ A function that automatically translates a given sentence into Korean


def to_korean (sentence):
    from deep_translator import GoogleTranslator
    translated = GoogleTranslator(source='auto', target='ko').translate(sentence)
    return translated

print( to_korean("I am a boy.") )
# 저는 남자입니다.


⑶ A function that automatically translates a given sentence into Japanese


def to_japanese (sentence):
    from deep_translator import GoogleTranslator
    translated = GoogleTranslator(source='auto', target='ja').translate(sentence)
    return translated

print( to_japanese("I am a boy.") )
# 私は男の子です。


Translation of a whole Markdown document

⑸ A function that turns an arbitrary variable-length natural language sentence into a 384-dimensional vector, considering its meaning (cf. CELLama)


from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
import numpy as np
from scipy.sparse import csr_matrix
import pandas as pd
from sklearn.neighbors import NearestNeighbors
import torch
from torch.utils.data import DataLoader, TensorDataset
from xgboost import XGBClassifier

def sentences_to_embedding(sentences):
    embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
    db = embedding_function.embed_documents(sentences)
    emb_res = np.asarray(db)
    return emb_res


sentences = []
sentences.append("What is the meaning of: obsolete")
sentences.append("What is the meaning of: old-fashioned")
sentences.append("What is the meaning of: demagogue")
emb_res = sentences_to_embedding(sentences)



2. LLM applications

ollama.ai (free): Llama2/3/4, Phi-3, Mistral, Gemma, etc

GroqChat (free): Mixtral, Llama3, Gemma

OpenRouter (charged): Can utilize ChatGPT API, etc.

⑷ English–Korean translation using Llama2 (free)


import ollama

def english_to_korean(sentence):
    content = 'Translate "' + sentence + '" to Korean. Output only the translated sentence.'
    response = ollama.chat(model='llama2', messages=[
      {
        'role': 'user',
        'content': content,
      },
    ])
    return response['message']['content']
    
sentence = "I am a boy."
english_to_korean(sentence)


⑸ Determine whether a given sentence contains a chemical formula


import ollama

def is_chemical_formula(sentence):
    content = 'Please determine if "' + sentence + '" contains a chemical formula or not. If it is correct, answer "sure"; otherwise, "no".'
    response = ollama.chat(model='llama2', messages=[
      {
        'role': 'user',
        'content': content,
      },
    ])
    return response['message']['content']
    
sentence = "NH2OH is an amine."
result = is_chemical_formula(sentence)
print(result)
print('sure' in result.lower())

sentence = "I am a boy."
result = is_chemical_formula(sentence)
print(result)
print('sure' in result.lower())

### Output ###
'''
The term "NH2OH" does contain a chemical formula, so the answer is "yes" or "sure".
True
The statement "I am a boy" does not contain any chemical formulas, so the answer is "no".
False
'''


⑹ Determine whether a given noun is a proper noun or a common noun


import ollama

def is_proper_noun(noun):
    content = 'Please determine if "' + noun + '" is a proper noun or common noun. If it is a proper noun, answer "proper"; otherwise, "common".'
    response = ollama.chat(model='llama2', messages=[
      {
        'role': 'user',
        'content': content,
      },
    ])
    return response['message']['content']
    
sentence = "Pencil"
result = is_proper_noun(sentence)
print(result)
print('proper' in result.lower())

sentence = "Feynman"
result = is_proper_noun(sentence)
print(result)
print('proper' in result.lower())

### Output ###
'''
"Pencil" is a common noun. Therefore, the answer is "common".
False

"Feynman" is a proper noun. Therefore, the answer is "proper".
True
'''



3. LLM agents

⑴ Overview: 2026 is the year of LLM agents

OpenClaw (clawd.bot)

① An open-source project made by an independent developer, Peter Steinberger (steipete).

② Anthropic requested a name change due to trademark issues, so it was rebranded as Moltbot → OpenClaw.

⑶ Claude Code

① An LLM-based agentic coding tool made by Anthropic. The specific models are Haiku (fast), Sonnet (balanced) and Opus (high performance)

② How to set-up

Step 1. Install Antigravity

Step 2. Configure the initial settings upon first launch after installation (ref)

Step 3. Sign in with your Google account

Step 4. After accessing Antigravity, download three Python packages and Claude Code from the Marketplace

Step 5. Reopen Antigravity

Step 6. Connect the workspace

Step 7. Type claude in terminal and link google account

③ Useful skills (ref1, ref2, ref3, ref4, ref5)

⑷ ChatGPT Pro

⑸ Google Ecosystem

① Model: Gemini, Gemma

② Research: NotebookLM

③ Design: Stitch (text → UI), Whisk (image visualization)

④ Text → Video: Veo, Google Vids

⑤ Coding: Antigravity (AI-based IDE), Jules (coding secretary)

⑥ Agent connecting the above tools: A2A, ADK



4. AI Scientist

The Horizon of Cognition and the Breakthrough Called AI

⑵ FunSearch

AlphaGeometry

The AI Scientist (Sakana AI, Nature (2026))

FutureHouse

The Virtual Lab (Youtube clip), The Virtual Biotech

AI-Descartes: Symbolic methods to rediscover Kepler’s third law

Theorizer

⑼ CodeScientist (Jansen et al., 2025)

⑽ AgentLab (Schmidgall et al., 2025)

⑾ Popper (Huang et al., 2025)

⑿ HypoBench (Liu et al., 2025)

AutoDiscovery: A model that generates creative hypotheses using an uncertainty (surprisal) measure based on the Beta distribution.

Towards an AI co-scientist

The AI grad student

Kosmos

⒄ Google Medical AI Ecosystem


Google Medical AI Agent Revolution

Figure 1. Google Medical AI Ecosystem



Input: 2024.02.10 13:34

Edited: 2026.01.29 00:43

results matching ""

    No results matching ""