2. Initiative 1. Future writing and project plan
3. Initiative 2. Content acquisition plan
4. Initiative 3. Image generation plan
5. Initiative 4. Translation project
6. Initiative 5. Publication project
7. Initiative 6. Apps on a blog
8. Initiative 7. Human-computer interface (HCI)
9. Initiative 8. Research on the nature of knowledge
10. Closing
1. Before entering
This blog started as a diary in 2007. Through it, however, I came to feel special and wanted to communicate with the world. Then, while watching the match between Lee Se-dol and AlphaGo in 2016, I wanted to sketch a tree of human knowledge on this blog. Now I would like to complete this project in earnest.
Mission 1. Realization of the democratization of knowledge
○ Because education is no longer equal…
○ Because education no longer shows hopes and dreams…
○ A hope for many countries that is getting sick…
○ People often get frustrated because studying something is difficult and takes an unnecessarily long time.
○ There are many cases where people don’t fully enjoy the pleasure of satisfying their desire for knowledge.
○ Strong measures should be taken because enormous educational expenses make many countries sick.
○ By the way, will the countries’ enthusiasm for education be lowered even if the final and best knowledge system for all fields is disclosed?
○ However, the diversity of available knowledge is related to a pluralistic society, which will alleviate the high educational fervor characteristic of a monistic society.
○ Monopolies on knowledge are eroding faster than expected. The day when equality of knowledge is realized may not be that far away.
○ The democratization of knowledge reduces social conflict.
○ Inequality in knowledge causes conflict between classes.
○ The disclosure of the right knowledge corrects the wrong knowledge. Right knowledge is always simple and clear.
○ Secrecy creates unnecessary illusions.
○ Equality of knowledge is a way to strengthen the power of the people.
○ While ChatGPT seems to realize the democratization of knowledge, some scholars argue that it may drastically widen the knowledge gap.
○ What matters is the democratization of knowledge for people.
○ Advice from a man named “Avocado Addict”
○ In an era where AI dominates AI, the collapse of capitalism will accelerate and a return to a self-sufficient society will proceed. For those who lag behind in this era, the democratization of knowledge is essential.
Mission 2. Prevention of human cognitive saturation
○ The arrival of the singularity : It is expected that within the next 50 years, knowledge will accumulate faster than mankind, as a collective intelligence, can understand it.
○ This timing seems to have been brought forward by the unexpectedly early advent of ChatGPT.
○ Information expansion : There are still concerns about the explosive growth of IT and bio-information.
○ Prevention of saturation of IT information : Could be resolved by cooperating with the IT industry.
○ Prevention of saturation of bio-information : Could be resolved by cooperating with the research community. A gene-centered system could be built to realize it.
○ How can one person collect all the knowledge of mankind?
○ Answer 1. Crowdsourcing model : A strategy of building a business model in which many people collect and share knowledge.
○ This is the most important conclusion people reach while studying economic history: the world has always demonstrated increasing returns to scale. The increase in population led to irreversible technological development, and people’s lives became richer. Likewise, the more knowledge we share, the more people gather, and the more knowledge we gather, the more knowledge we can create.
○ Business models (BMs) such as Wikipedia, GitHub, and Naver Knowledge-in are well known for using this approach.
○ Answer 2. Artificial intelligence (AI) model : An idea of living and evolving letters.
○ A manuscript is a living and breathing creature. So we have to keep filling what has been written with new ideas.
○ Everyone can write, but not everyone can breathe life into manuscripts. Thus, it may be a new field without competition.
○ Answer 3. Social media model : Building a network in the form of a point organization.
○ The business model I have in mind takes the form of a branch organization. The vertical hierarchy adopted by large companies does not suit modern times, when individuals’ desire for self-realization has become stronger. I am planning a strategy in which one person becomes the center of the network by forming one-to-one private contracts rather than a vertical hierarchy. This is what I saw and felt at the startup that I was with from the beginning.
○ Answer 4. Insights on Knowledge
○ The idea of core concepts rests on the fact that they make up only a small percentage of total knowledge.
○ The contents of the core concepts can be thought of as summary notes for exam study.
○ To understand the exact definition of a core concept, refer to surprisal in information theory.
○ Moreover, ChatGPT can supply the details when queried with only the core meta-knowledge, so together they can indeed cover all the details of knowledge.
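As a rough illustration of the surprisal mentioned above: in information theory, the surprisal of an event with probability p is -log2(p) bits, so a rare statement carries more information than a common one. The sketch below only shows the formula. The probabilities are made-up numbers for illustration, not measurements from this blog, and how one would actually estimate the probability of a concept is left open.

```python
import math


def surprisal_bits(p: float) -> float:
    """Surprisal (self-information) of an event with probability p, in bits."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# Hypothetical probabilities, chosen only for illustration.
examples = {"common statement": 0.5, "rare statement": 0.001}
for label, p in examples.items():
    print(f"{label}: {surprisal_bits(p):.2f} bits")
```

Under this reading, a core concept would be one whose surprisal is high relative to the text around it.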
Mission 3. Attempts to preserve human knowledge
○ Knowledge must be equal.
○ Immediately after the collapse of the ancient Roman Empire, the vast technology possessed by the Roman Empire was lost.
○ Humans are always facing extinction scenarios with high probability : The lifespan of the earth’s civilization is hundreds of years at most…
○ Climate change
○ Nuclear war
○ Asteroid collision
○ Infectious disease outbreak
○ Rebellion of AI
○ The current blog has the mission to preserve human knowledge in case human knowledge is lost.
Figure 1. Illustration from the animation Dr. Stone
The animation Dr. Stone also showed me a similar idea.
○ As blockchain technology teaches us, sharing makes knowledge last forever.
○ Of course, I will work to prevent the extinction of mankind and share scenarios in preparation for it.
○ It is also desirable to plan a complete and flawless data storage so that an alien civilization can follow human knowledge in the distant future.
○ Alternatively, preparing seeds of knowledge to start a civilization on another planet should also be considered.
○ In other words, wouldn’t an encyclopedia of all things be needed when a civilization is being built on another planet hundreds of years from now?
Mission 4. Accelerating knowledge acquisition
○ As the tempo of music is getting fast and repetitive videos like TikTok are getting popular, wouldn’t it be future-oriented to quickly acquire knowledge?
○ Low birth rate is a phenomenon experienced by all developed countries. In an era when a few have to support the majority, the pace of knowledge acquisition must be accelerated.
○ Accelerating knowledge acquisition can be a good measure for addressing women’s career breaks.
○ Human memory is not eternal. Don’t experts also need a knowledge system that can quickly recall forgotten knowledge?
○ Accelerating knowledge acquisition will strengthen humanity’s position in any future conflict between humans and AI.
○ With the rising cost of education, young people are entering society later. In this situation, isn’t there a need to accelerate the acquisition of knowledge?
○ Isn’t the acceleration of knowledge acquisition essential for those of you who have not yet finished taking on challenges?
2. Initiative 1. Future writing and project plan
○ September 2024: University Mathematics Competition, Organic Chemistry Problem Solving
○ October 2024: US Patent Bar Exam
3. Initiative 2. Content acquisition plan (in progress)
2-1. Next Generation Search Engine Project (in progress) : Search Engine, AI
I don’t think I can create content for all areas here. This is because 1) securing expertise that exceeds all experts in each field is almost impossible at the moment, 2) even if it is possible through artificial intelligence, it is not desirable to threaten existing jobs without fostering new industries, and 3) dealing with all deep concepts can greatly damage beginners’ understanding.
So my strategy is to 1) focus on collecting knowledge that can be crawled relatively easily (2-1), 2) generate content through AI (2-2), 3) get AI assistants to collect unstructured knowledge (2-3), and 4) promote public cooperation through crowdsourcing (2-4).
Unlike traditional search engines with graph-structured data and list-structured Wikipedia, this search engine is differentiated by its knowledge management system (KMS) with a tree structure. I personally think the tree structure is the most similar to the human cognitive structure. Also, I am considering a private KMS offered to companies, or a KMS offered to the government as a means of promoting its policies. I think we need a “real” guide for hitchhikers traveling in the Milky Way… This will help 1) acquire knowledge quickly, 2) ensure the integrity of knowledge (consistency of grammar and content) and its non-redundancy (allowing only the necessary level of concept overlap), 3) communicate within a company, and 4) facilitate comparison of patent claims.
What I felt while working at a start-up from its beginning is that maintaining the integrity of knowledge is more difficult than I thought. In this regard, I observed that 1) the results of data analysis should not differ from day to day, but when they did, I could not tell which parameters had made them different, 2) communication between experts in different fields was poor, and 3) only the first encounter with a piece of knowledge reveals that it is new to us. Regarding 3), I have recorded a lot of new knowledge since I was quite young, so I was able to build a tree at the current level, but people who already know too much may not be able to tell which knowledge is new to them.
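The tree-structured KMS described above can be sketched in a few lines. This is only a minimal illustration under my own assumptions, not the blog's actual implementation: the `Node` class, the `walk` and `find_term` helpers, and the sample titles are all hypothetical. The point it shows is that a search hit in a tree comes back with its full path, that is, with its context.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One entry in a hypothetical tree-structured knowledge base."""
    title: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child


def walk(node: Node, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], Node]]:
    """Yield (path, node) pairs in depth-first order."""
    here = path + (node.title,)
    yield here, node
    for child in node.children:
        yield from walk(child, here)


def find_term(root: Node, term: str) -> List[str]:
    """Return the tree path of every node whose text mentions the term."""
    needle = term.lower()
    return [" > ".join(p) for p, n in walk(root) if needle in n.text.lower()]


# Placeholder content, for illustration only.
root = Node("DNA technology")
cloning = root.add(Node("Cloning vectors"))
cloning.add(Node("Ti plasmid", "Agrobacterium tumefaciens carries the Ti plasmid."))
root.add(Node("PCR", "Polymerase chain reaction amplifies DNA."))

print(find_term(root, "Agrobacterium tumefaciens"))
```

A flat list or a graph would return the matching page; the tree additionally returns where that page sits in the hierarchy.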
○ Example 1. Digest Wiki project (not started)
○ In 2-1, Wikipedia was chosen as the number one target for gathering knowledge that can be crawled relatively easily. A sudden thought came to mind with the end of the translation project. Take, for example, the term “Agrobacterium tumefaciens” found in a sentence. Anyone curious about “Agrobacterium tumefaciens” can use the search function within my “DNA technology” material. He or she will then be able to identify which subcategories of the “DNA technology” posting specifically include the term “Agrobacterium tumefaciens”, and to some extent satisfy his or her original curiosity. After all, what people are curious about is information, what I can provide is contextual information, and the means of attracting people is probably terminology. The work of crawling all the pages of the Wiki project and securing a set of jargon dictionaries will begin shortly.
○ Example 2. Law should be equal to all (not started)
○ The law must not only have equal effects, but also be equally known to everybody. I came up with the idea of the legal maturity of all mankind. I can secure all legal documents in my country through the National Legal Information Center, and in a similar way, I can secure legal documents from around the world. In addition, collecting all the citation relationships between provisions will reveal the interconnection of all legal texts, that is, the network. Then, if I run a clustering algorithm, I will have an appropriate hierarchy. This will make it easier for anyone to quickly understand legal knowledge.
○ There are 1,580 laws in South Korea (as of 2022), and there are also the constitution, orders, ordinances, and rules, so thousands of posts will be created.
○ The laws of the United States vary from state to state, so…
○ If Chernoff faces are used, it could become possible to recommend a litigation strategy for each judge.
○ Example 3. Book Network (not started)
○ The number of books is too large. If I cluster all the books, can I know which ones are more important and which ones I should read first?
○ I think Amazon (amazon.com) has already tried this.
○ I can simply implement a heatmap for books using their mutual citation relationships. (ref)
○ I can create maps of books using an LLM. (ref)
○ Example 4. OCR (image to text) (in progress)
○ This can be implemented using a large multimodal model, such as GPT-4V.
○ These make it easy to convert image data and publications into digital data.
○ Example 5. Industry standardization system (in progress)
○ The goal is to link the data of the top experts in each field to establish industry standards. Which experts are selected as the top experts will follow an internal evaluation and will change over time. The result will have an exclusive function similar to a patent and will be submitted to obtain industrial standards.
○ Example 6. Foundation Model (in progress)
○ A foundation model that extracts meaningful information from a heap of crawled knowledge. This exciting project started with the idea of deriving useful scores from sequencing data and the realization that it could also be applied to blogs. I might even need to create my own transformer model.
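For Example 2 above (the citation network of legal provisions), the clustering step can be sketched with the simplest possible stand-in: connected components of the citation graph, computed with union-find. This is an assumption made for illustration, not the method the project uses. A real system would more likely use community detection or hierarchical clustering, and the provision labels below are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cluster_by_citation(edges: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group provisions into clusters: two provisions share a cluster when a
    chain of citations connects them (connected components)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups: Dict[str, List[str]] = defaultdict(list)
    for node in parent:
        groups[find(node)].append(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical citation pairs (citing provision, cited provision).
citations = [("A-1", "A-2"), ("A-2", "B-5"), ("C-3", "C-4")]
print(cluster_by_citation(citations))
```

Each resulting cluster could become one branch of the hierarchy, with further splitting inside large clusters.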
2-2. AI Data Generation (in progress) : AI
○ Overview
○ The AlphaFold2 database was released in July 2022. With about 200 million protein structures, it has produced 1,000 times more protein structures in a year than humans have experimentally determined in 50 years. Isn’t AI’s generation of vast amounts of data, beyond the human cognitive level, ultimately the way forward in acquiring content? And isn’t the database for managing and retrieving that data important? I think that will be the final form of this blog.
○ In other words, a strategy that automatically generates data through AI and curates the generated data.
○ Method 1: AI-based Database (in progress)
○ Example 1: Biopolymer Library : Uploading various 3D model information on the frontend.
○ Example 2: Homologous/Heterologous Ligand-Receptor Interaction Affinity Database: Target completion by August 2024
○ Method 2: Automatically generating tags for posts given by LLM (complete)
○ The tags are not significantly affected by hallucination, so it’s okay
○ Method 3: Problem Bank AI (in progress)
○ If generative AI models can create new knowledge, then it’s also feasible to create AI for generating problems, i.e., a Problem Bank AI
○ Example 1: TOEFL speaking problem
○ Example 2: Organic Chemistry Nomenclature 100,000 Questions (complete)
○ Example 3: Organic Chemistry Isomers 2,300 Questions (complete)
○ Example 4: Organic Chemistry Spectroscopy 100 Questions (complete)
○ Using problems generated by AI as learning data, it’s possible to implement AI that solves problems, i.e., AI that generates hypotheses or theories
○ Specifically, the problem-solving AI is presumed to be a series of serial and parallel circuits of LLM components (ref)
○ In line with the above idea, DeepMind published an AI that solves IMO problems in Nature in 2024 (ref).
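The data format behind a problem bank like the one in Method 3 above can be illustrated with a small template-based generator. This is a deliberate simplification and an assumption on my part: it uses a fixed template rather than a generative model, it only covers straight-chain alkanes with one to ten carbons, and the function names are hypothetical. It shows how question/answer pairs could be produced in bulk and reproducibly.

```python
import random
from typing import Dict, List

# IUPAC names of the straight-chain alkanes with 1 to 10 carbons.
ALKANES = ["methane", "ethane", "propane", "butane", "pentane",
           "hexane", "heptane", "octane", "nonane", "decane"]


def formula(n: int) -> str:
    """Molecular formula C_nH_(2n+2) of an alkane with n carbons."""
    carbon = "C" if n == 1 else f"C{n}"
    return f"{carbon}H{2 * n + 2}"


def make_problems(count: int, seed: int = 0) -> List[Dict[str, str]]:
    """Generate `count` question/answer pairs; the seed makes runs repeatable."""
    rng = random.Random(seed)
    problems = []
    for _ in range(count):
        n = rng.randint(1, len(ALKANES))
        problems.append({
            "question": f"Name the straight-chain alkane with molecular formula {formula(n)}.",
            "answer": ALKANES[n - 1],
        })
    return problems


for problem in make_problems(3):
    print(problem["question"], "->", problem["answer"])
```

Pairs in this shape could then serve as training or evaluation data for a problem-solving model.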
2-3. Evolving and Living AI (in progress) : Crawling, NLP
○ Basic idea
○ Humans eat with their mouths like animals, but unlike animals, they eat knowledge with their heads.
○ Just as food must be broken down into monomers to be digested, texts must be broken down into concepts to be digested.
○ Of course, the context of the relationship between sentences is not meaningless, but it is only meaningful to make the concept more clear.
○ Just as living things constantly metabolize and evolve, knowledge must be alive and able to evolve.
○ Method 1. A model that crawls knowledge on its own (not started)
○ Premise: The existing knowledge system has integrity, non-redundancy, unity, and brevity.
○ Integrity : The nature of the content being free of errors.
○ Non-redundancy : The property of not overlapping more than necessary.
○ Unity : The nature of a vast amount of text being connected into a single flow.
○ Brevity : The characteristic of the style being simple, without anything unnecessary.
○ Step 1. Proceed sentence by sentence
○ Step 2. Summarize only sentences that can be understood on their own : Don’t discard sentences wholesale only because they contain demonstratives (e.g., “this”, “that”, “the”).
○ Step 3. Summarize only sentences about core concepts
○ Step 4. Allocation of the concepts by field : law, economy, science, etc.
○ Step 5. Connect to specific posting in each field : Posting is often integrated within each field.
○ Step 6. Connect the specific number of lines in each post : Insert the corresponding core knowledge at that point if there is no duplication.
○ Interim conclusion : Again, the knowledge system has integrity, non-redundancy, unity, and brevity.
○ Final conclusion : The knowledge system has integrity, non-redundancy, unity, and brevity forever (mathematical induction)
○ The scheme for the model that crawls knowledge on its own : The model of self-crawling knowledge is simply a paradigm of modifying and expanding concepts one after another, in a chain. I think this method is similar to human cognition. In addition, the backpropagation algorithm, a key principle of deep learning, should be applied. Based on this insight, I believe that humans use deep learning not only in the sensory systems but also in the central nervous system.
○ Method 2. An idea to back up postings periodically through crawling (link) (complete)
○ Method 3. An idea for an LLM-based AI assistant that automatically reflects changes on the foreign-language platforms hosted on GitHub as soon as posts are newly created or updated (in progress) : Refer to the translation project here.
○ Method 4. Reviewer AI, a model that reviews content for errors and typos using crawling and LLM (in progress)
○ ✅ Find all posts that include “font-size: 14pt;” in the first line of the posting.
○ ✅ Find all posts with typos (e.g., cases where only a single Korean character is present).
○ ✅ Find all hyperlinks that need to be updated due to changes in the referenced posts.
○ ✅ Find all hyperlinks that link to a posting in general but should link to a more specific part.
○ ✅ In code text, find instances where ‘white-space: pre’ is not present in the style.
○ ✅ Find mismatch points between Korean Tistory and English GitHub.
○ ✅ Identify a model that periodically points out content errors using ChatGPT API or LLaMa2.
○ ✅ Download all images and attachments
○ ✅ Find all posts with “[Table of Contents]” in bold
○ ✅ Locate unnecessary ... tags
○ Method 5. Overall posting management DB : ID, title, content, comments, links, backlinks, recent modifications → Based on LLM, the link between posts can be calculated to secure a hierarchy of the current blog’s knowledge system (ref) (in progress)
○ Method 6. An AI secretary that manages files with an LLM : In the future, LLMs are expected to develop into this kind of AI secretary project. (not started)
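Three of the Reviewer AI checks listed under Method 4 above can be approximated with plain pattern matching, before any LLM is involved. The sketch below is an assumption about how such checks might look, not the blog's actual reviewer: the `review` function, the dict-of-HTML input format, and the "standalone single Hangul syllable" heuristic for typos are all hypothetical.

```python
import re
from typing import Dict, List

# A single Hangul syllable standing alone between whitespace (typo heuristic).
LONE_HANGUL = re.compile(r"(?<!\S)[\uac00-\ud7a3](?!\S)")
# Opening <pre ...> or <code ...> tags, so their style attribute can be checked.
CODE_TAG = re.compile(r"<(?:pre|code)\b[^>]*>", re.IGNORECASE)


def review(posts: Dict[str, str]) -> Dict[str, List[str]]:
    """Return the issues found per post id; `posts` maps id -> raw HTML."""
    report: Dict[str, List[str]] = {}
    for post_id, html in posts.items():
        issues = []
        lines = html.splitlines()
        first_line = lines[0] if lines else ""
        if "font-size: 14pt;" in first_line:
            issues.append("font-size: 14pt; in first line")
        text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
        if LONE_HANGUL.search(text):
            issues.append("isolated single Korean character")
        if any("white-space: pre" not in tag for tag in CODE_TAG.findall(html)):
            issues.append("code block without white-space: pre")
        if issues:
            report[post_id] = issues
    return report


sample = {
    "1": '<p style="font-size: 14pt;">Title</p>\n<p>Body</p>',
    "2": '<pre style="color: red">x = 1</pre>',
    "3": '<pre style="white-space: pre">x = 1</pre>',
}
print(review(sample))
```

The single-character heuristic will also flag legitimate one-syllable Korean words, so its hits are candidates for manual or LLM review rather than confirmed typos.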
2-4. Crowd-sourcing (in progress) : SNS, NLP
○ Describes another way for individuals to gain strength: how to introduce crowdsourcing to content acquisition.
○ Method 1. User Behavior Information (complete) : Implementation possible in the form of videos using Microsoft’s Clarity
○ Possible to know which parts of a given post are important through the collective intelligence of users → Summary generation possible
○ Possible to know which posts are popular in different countries
○ Possible to know if the intended interactive web function is being used properly
○ Possible to know which posts are active in terms of user behavior such as duration of stay
○ Possible to know where Javascript errors occur
○ Individual user tracking
○ Real-time user identification
○ Method 2. Q&A of the blog
○ Many readers have already provided useful questions and materials. However, in general, I can see that both my readers and I can communicate a lot of information through questions. It is different from conventional Q&A services in that 1) it is difficult to ask around as an adult, and 2) I may be able to answer as soon as possible (compared to Naver Knowledge-in service).
○ Method 3. Chatbot
○ Earlier, I revealed the idea of collecting core concepts because 1) core concepts exist in a small proportion, so it is possible to collect them one by one with human effort, and 2) core knowledge is too abstract for artificial intelligence to access. However, ChatGPT can tell us all the details once we define a core concept, so the combination of ChatGPT and core concept aggregation can suggest the possibility of drawing a blueprint for all human knowledge.
○ I have implemented a chatbot through OpenAI’s ChatGPT Store. (app, ref, YouTube)
○ I have also tried implementing a chatbot using on-premise LLMs such as Meta’s LLaMa and LLaMa2, and Mistral. (ollama)
○ Method 4. Actively Promote Content: Twitter, Tistory Forums, YouTube, Naver Knowledge-in, etc.
4. Initiative 3. Image generation plan (in progress)
3-1. 3D Images as contents (not started)
○ Some of the images posted on this blog will be reborn as 3D images. The era has finally come when AI algorithms can reconstruct 3D images from multiple 2D images.
3-2. Resolution of copyright disputes (complete)
○ In addition, the era has come when it is possible to reconstruct an existing image in the style one wants. Through this, I expect to be able to post more appropriate images and resolve many copyright-related disputes.
○ Among current generative models, DALL∙E is the most successful, so I want to actively utilize it. It can even generate appropriate images that match the given posting content, so considerable synergy with this blog is expected. However, many typos are still found in the letters within the images, so this part seems imperfect.
3-3. Image restoration (not started)
○ If the above discussion is extended further, some images, regardless of copyright, are hard to use because they carry watermarks or unnecessary characters, even though their quality is quite good. In this case, it will be possible to restore the part one wants by utilizing the deep image prior (DIP).
3-4. Copyright detour strategy through image watermark development (not started)
○ According to Article 102 Subparagraph 4 of the Copyright Act, “an act of making it possible to know the location of information and communication network works through information search tools” is not considered a copyright infringement. As a result, search engine platforms such as Google and Naver can display images without infringing copyright. If I develop a technology that downloads images and automatically watermarks them with a link to their original source, uploading images to blogs could be free from copyright issues.
5. Initiative 4. Translation project (complete)
○ Overview
○ Objective 1. Increase in Foreign User Influx: There are 70 million Korean language users worldwide. Considering the global population of 7 billion, a translation project seems essential to fully capture the benefits of this 100-fold difference. I am planning a project in which, when a post in one language is updated, all other language versions of the post are updated in real time as well.
○ Objective 2. Reduce Dependence on Tistory: I have been pondering a lot after observing the recent unilateral policies of Tistory.
○ Objective 3. Emphasizing Differentiation Through Technological Superiority: By introducing LLM and other technical differentiations, I aim to strengthen my capabilities and improve the satisfaction of the readers.
○ Chinese translation materials (in preparation)
○ Method 1. Use of Naver Papago (rejected)
○ The translation of postings can be done through five steps : html parsing → preprocessing → Naver Papago interworking → post-processing → simple correction of translation.
○ weakness : Laborious.
○ Method 2. Google Translate API, Crawling (rejected)
○ weakness : Difficult to use and translation quality is too low.
○ Method 3. Python web crawling → Markdown conversion → ChatGPT API → GitHub commit (complete) (ref)
○ Python web crawling: Using the Python ‘requests’ library.
○ Markdown conversion: Using the ‘BeautifulSoup’ library.
○ ChatGPT API: If the instruction becomes too lengthy, there’s a decline in quality, necessitating text preprocessing.
○ Method 4. Python web crawling → Markdown conversion → LLaMa2 API → GitHub commit (in progress)
○ Advantage : It runs locally, so it is free and reproducible. It enables more efficient fine-tuning. ollama makes using llama2 a lot easier.
○ Remark : Using the LLaMa2 API key also incurs costs (Replicate)
○ Policy 1: Do not produce overly perfect translations.
○ Reason 1. A too-perfect translation system can promote the leakage of domestic intellectual property abroad.
○ Reason 2. Even though most of the effort goes into the original site, a too-perfect translation can reduce traffic to the original Korean site itself.
○ Policy 2: Considering a system where editing on the original site leads to simultaneous changes across all foreign language sites.
○ Other foreign language sites: Either established or planned to be established on GitHub. This includes languages like English, Japanese, Spanish, etc.
○ Step 1: Edit on the original site (e.g., Tistory).
○ Step 2: Javascript recognizes the editing event and forwards it to an AI assistant based on LLaMa2.
○ Step 3: The LLaMa2-based AI assistant inputs commands into the internal system, executing Git Clone and Git Commit.
○ Step 4: All other GitHub foreign language sites are uniformly updated.
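The text preprocessing mentioned under Method 3 above (quality drops when the instruction becomes too long) can be sketched as a chunking step between the Markdown conversion and the LLM call. This is a minimal sketch under my own assumptions: the function name and the 2,000-character budget are illustrative, splitting happens only at blank lines, and the actual API call and GitHub commit are left out.

```python
from typing import List


def chunk_markdown(markdown: str, max_chars: int = 2000) -> List[str]:
    """Split Markdown at blank lines into chunks of at most `max_chars`
    characters, so each translation request stays short.
    A single paragraph longer than `max_chars` is kept whole."""
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for para in (p.strip() for p in markdown.split("\n\n")):
        if not para:
            continue
        extra = len(para) + (2 if current else 0)  # 2 for the joining blank line
        if current and size + extra > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
            extra = len(para)
        current.append(para)
        size += extra
    if current:
        chunks.append("\n\n".join(current))
    return chunks


post = "# Title\n\nFirst paragraph.\n\nSecond paragraph."
for i, chunk in enumerate(chunk_markdown(post, max_chars=30)):
    print(i, repr(chunk))
```

Each chunk would be translated separately and the results joined with blank lines, which keeps paragraph boundaries intact across languages.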
6. Initiative 5. Publication project (complete)
○ Objective 1. Growth of education market
The current market conditions are remarkable. Southeast Asia and India have a large population of young people, but their educational level is very low. The demand for education in these regions is exploding, and in 10 years it will be a really remarkable market. Also, let’s look at Africa. It is still undervalued, but the education market in 30 to 50 years is drawing attention. It is already well known that there are many negative factors in the domestic education market due to a lack of school-age population, and in the end, it would be the right direction to turn to overseas. Nevertheless, there is a great lack of means to look at market conditions. Therefore, I have adopted the publication of books as an indicator product to understand the atmosphere of the overseas education market.
○ Objective 2. Rediscovering the possibility of publication
One section of a résumé (curriculum vitae, CV) is titled Honors, literally honorable achievements. Examples include language scores, GPA, volunteer work, exchange programs, and prizes, among which papers and patents are the most powerful. Writing a paper requires meeting certain requirements under professors in academia, and writing a patent requires knowledge of patents, a completed invention, and considerable cost. There is a third honor on par with papers and patents: publishing. I think fewer people know about this than I expected. In the future, the print market will decline and the electronic publishing market will rise. It is remarkable that anyone can publish, thanks to the vast growth potential of the electronic publishing market and the ease of electronic publishing.
○ Strategy 1. Partnership with Festbook media
By rediscovering the growth of the education market and the possibility of publishing, I sought out e-publishing partners and proceeded with publication through Festbook Media. It’s not an exclusive contract, but I think it’s a good condition, so I’m looking forward to signing a number of publishing contracts (~100 volumes).
○ Strategy 2. Establishing a publishing company
○ In the era of AI and platforms, I am planning to establish a publishing company to expand opportunities for writers.
○ A publishing company named ‘Izgrimin’ was established (23.06.27).
7. Initiative 6. Apps on a blog (in progress)
6-1. Social statistics (not started)
○ My curiosity also extends to social statistics. The literature alone is limited for understanding them. Therefore, I plan to activate the Quick Toolbar pop-up window to conduct surveys. Besides, there’s nothing much to do with the voting system. According to Metcalfe’s law, the value of a network is proportional to the square of the number of its nodes. I think that theory can also be applied to YouTube and any website. I’d like to confirm that.
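The quadratic growth behind Metcalfe's law can be seen by counting the distinct pairwise connections among n nodes, n(n-1)/2, which grows roughly as n². The snippet below only illustrates that count. Whether the value of a blog or of YouTube actually follows it is the open question raised above, and the function name is mine.

```python
def possible_links(n: int) -> int:
    """Number of distinct pairwise connections among n nodes."""
    return n * (n - 1) // 2


for n in (10, 100, 1000):
    print(n, possible_links(n))
```

Multiplying the number of nodes by 10 multiplies the number of possible links by roughly 100.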
6-2. Knowledge should be tree (not started)
○ Wikipedia is a list, but knowledge must be sorted into trees. What if all knowledge is ordered? Then each piece of knowledge can be explained using only the knowledge that comes before it. In that way, I can order all knowledge and create an artificial intelligence that determines where in that order a new piece of knowledge belongs. The new knowledge will then automatically be placed in the right place, so that the machine can understand it. If you have a sentence that says “A consists of B and C,” you can place it right after the last of A, B, and C.
○ Can’t we use a large language model (LLM) for this task?
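The placement rule described above ("place it right after the last of A, B, and C") can be sketched directly. This is a toy version under stated assumptions: concepts are detected by plain substring matching, where the text above suggests an LLM might do this job, and the function name and sample sentences are hypothetical.

```python
from typing import List


def insert_after_prerequisites(ordered: List[str], sentence: str,
                               concepts: List[str]) -> List[str]:
    """Return a new list with `sentence` placed right after the last entry of
    `ordered` that mentions any of `concepts`; append at the end if none does."""
    last = -1
    for i, entry in enumerate(ordered):
        if any(concept in entry for concept in concepts):
            last = i
    position = last + 1 if last >= 0 else len(ordered)
    return ordered[:position] + [sentence] + ordered[position:]


knowledge = [
    "A cell is the basic unit of life.",
    "A membrane encloses the cell.",
    "Water is a solvent.",
    "The nucleus stores DNA.",
    "Gravity attracts masses.",
]
new_sentence = "A cell consists of a membrane and a nucleus."
updated = insert_after_prerequisites(knowledge, new_sentence,
                                     ["cell", "membrane", "nucleus"])
print(updated.index(new_sentence))
```

Because every concept the new sentence relies on already appears earlier in the list, the ordering property (later knowledge explained only by earlier knowledge) is preserved after the insertion.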
6-3. Automatically trimming html codes (complete)
○ The Python BeautifulSoup library can efficiently handle the task.
○ When translating the Korean text in the html code with ChatGPT, I confirmed that the html code was trimmed as well.
6-4. Automatically generating articles (complete)
○ Python allows us to automatically generate html-formatted articles. (ref) I’m actually using it in my field of bioinformatics.
○ Here, I will use it to automatically create the reports for stock trends, public opinion patterns, weather conditions, etc.
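Generating an html-formatted article from data, as described above, can be as simple as filling a template. The sketch below is only an illustration: the `render_report` function, the table layout, and the sample rows are assumptions of mine and not the code referenced in (ref).

```python
from html import escape
from typing import List, Tuple


def render_report(title: str, rows: List[Tuple[str, str]]) -> str:
    """Build a small HTML article with a title and a two-column table."""
    body = "\n".join(
        f"    <tr><td>{escape(key)}</td><td>{escape(value)}</td></tr>"
        for key, value in rows
    )
    return (
        f"<article>\n  <h2>{escape(title)}</h2>\n"
        f"  <table>\n{body}\n  </table>\n</article>"
    )


# Placeholder data, for illustration only.
print(render_report("Daily weather", [("Seoul", "12 C"), ("Busan", "15 C")]))
```

Escaping every value with `html.escape` keeps characters such as `<` or `&` in the data from breaking the generated page.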
6-5. Image in-link (not started)
○ Example (www.creative-diagnostics.com)
6-6. Simple apps on a blog (in progress)
○ First of all, I plan to install a simple app on my blog.
✓ Various types of converters: USD ↔︎ KRW, Decimal ↔︎ Hexadecimal, Time Zone Conversion, Meter ↔︎ Feet, Kg ↔︎ Pound, Celsius ↔︎ Fahrenheit
✗ A program that automatically shuffles tarot cards (planned)
✗ Web-based MBTI test (planned)
✗ A program that explains history on a 4D map (planned)
✗ Program for calculating the timing requirements of patents (planned)
✗ UI for LLM (e.g., ChatGPT) models via toolbar (planned)
✗ Plastic Surgery Simulator (planned)
✗ Free Bulletin Board (planned)
✗ Sudoku Puzzle Generator (planned)
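The fixed-ratio converters in the list above reduce to one-line formulas. The sketch below covers only those. The USD ↔ KRW and time zone converters are left out because they need live exchange rates and time zone data, and an app embedded in a blog page would more likely be written in JavaScript. The function names are illustrative.

```python
def celsius_to_fahrenheit(c: float) -> float:
    return c * 9 / 5 + 32


def fahrenheit_to_celsius(f: float) -> float:
    return (f - 32) * 5 / 9


def meters_to_feet(m: float) -> float:
    return m / 0.3048  # 1 ft is defined as exactly 0.3048 m


def kg_to_pounds(kg: float) -> float:
    return kg / 0.45359237  # 1 lb is defined as exactly 0.45359237 kg


def decimal_to_hex(n: int) -> str:
    return format(n, "X")


def hex_to_decimal(s: str) -> int:
    return int(s, 16)


print(celsius_to_fahrenheit(100), decimal_to_hex(255), hex_to_decimal("FF"))
```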
6-7. Database on a Blog (not started)
○ Method 1. It is not difficult at all: install the dataset on the NAS and link the corresponding URL from the blog.
○ Method 2. I’m seriously thinking about GitHub.
○ However, it is a question of which dataset to share.
○ Currently, the focus is on biological information data sets such as TCGA and GEO : UCSC Xena is a good example.
6-8. How to present the blueprint of human knowledge in tSNE (complete)
○ Direction 1. https://artsexperiments.withgoogle.com/tsnemap/
○ Direction 2. Illustration of papers in tSNE with language models (e.g., PubMedBERT)
○ Completed Project: Blog Post Embedding Using a Sentence Embedder
8. Initiative 7. Human-computer interface (HCI) (not started)
7-1. Acquisition of knowledge
○ What if you could organize all your knowledge? What if you could put VR devices into practical use? What if you could analyze brain waves through deep learning? You will be able to make a device that acquires knowledge just by wearing it on your head like a hat.
7-2. Generation of knowledge
○ If the above discussion is expanded, it will be possible to create a device that secures the person’s knowledge system just by wearing it on the head. Wouldn’t it be possible to realize human immortality or human computerization?
7-3. Reorganization of human personality
○ If you have an encyclopedia from anyone, you can track the content that a particular individual owns and implement the schema in that person’s head. (ref)
7-4. Spatiotemporal Omics
○ If Spatiotemporal Omics is implemented, it can be interpreted as a data-interpreting foundational model, which would enable the conversion from humans to computers.
9. Initiative 8. Research on the nature of knowledge (in progress)
10. Closing
I want to build a Tower of Babel consisting of Knowledge.
I will climb to the top of the world and wait for you there.
Input: 2019.11.01. 20:52
Revised: 2023.06.30 13:25