Setting Up My Local Python LLM Test Environment on a Mac


Mac local demo directory

A well-structured Python project

Scaffold

A well-structured project layout is key to keeping a codebase clean, manageable, and scalable. Here’s a sample structure for a moderate-sized Python project; adapt it to the specific needs of your project:

my_project/
│
├── docs/                   # Documentation files
│
├── src/                    # Source files
│   ├── __init__.py         # Makes src a Python package
│   ├── main.py             # Entry point of the project
│   └── module_name/        # A sample module (can be multiple)
│       ├── __init__.py     # Makes module_name a Python package
│       ├── my_class.py     # Sample class file (not class.py: class is a reserved word and cannot be imported)
│       └── utils.py        # Utility functions specific to this module
│
├── tests/                  # Automated tests
│   ├── __init__.py         # Makes tests a Python package
│   └── test_module.py      # Test file for module_name
│
├── venv/                   # Virtual environment (usually not version-controlled)
│
├── setup.py                # Build script for setuptools
│
├── requirements.txt        # Production dependencies
│
├── requirements_dev.txt    # Development dependencies
│
├── .gitignore              # Specifies intentionally untracked files to ignore
│
├── README.md               # Project overview, setup, and usage
│
└── LICENSE                 # Legal information about the project

Explanation of Directories and Files

  • docs/: Contains documentation files, like project documentation, API docs, etc.
  • src/: Contains all the source code of your project. Larger projects can have their source split into various modules within this directory.
  • tests/: Contains test cases. This is crucial for any sizable project to ensure code reliability and maintainability.
  • venv/: The directory where your Python virtual environment resides. It’s generally recommended to keep it outside of version control (e.g., Git).
  • setup.py: A script for setting up the project. It’s used to install your project as a package using pip. Useful if you’re planning to distribute your project.
  • requirements.txt: Lists the Python dependencies necessary for your project to run. These can be installed using pip install -r requirements.txt.
  • requirements_dev.txt: Similar to requirements.txt, but for development dependencies (like linters, testing frameworks).
  • .gitignore: Lists files and directories that should be ignored by Git. Typically includes venv/, __pycache__/, and other non-source-code files.
  • README.md: A Markdown file containing an introduction to your project, instructions on how to install and use it, and other essential information.
  • LICENSE: Contains the license details for your project, specifying how others can use your code.
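
The layout above could be created with a short script rather than by hand. The following is a minimal sketch, not part of the original notes: the scaffold function and the DIRECTORIES and FILES lists are hypothetical names, venv/ is left for the venv tool to create, and the sample class file is omitted so you can add it under a name of your choosing.

```python
from pathlib import Path
import tempfile

# Directories and empty files taken from the sample tree above.
DIRECTORIES = ["docs", "src/module_name", "tests"]
FILES = [
    "src/__init__.py",
    "src/main.py",
    "src/module_name/__init__.py",
    "src/module_name/utils.py",
    "tests/__init__.py",
    "tests/test_module.py",
    "setup.py",
    "requirements.txt",
    "requirements_dev.txt",
    ".gitignore",
    "README.md",
    "LICENSE",
]


def scaffold(root: Path) -> list[Path]:
    """Create the sample layout under root and return the file paths."""
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    created = []
    for name in FILES:
        path = root / name
        path.touch(exist_ok=True)  # the contents of existing files are left alone
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so no real project is modified.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold(Path(tmp) / "my_project"):
            print(path.relative_to(tmp))
```

To scaffold a real project, call scaffold(Path("my_project")) from the directory where the project should live.
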
/Users/liushihyen/wwwdata/python_venv

To see all versions of Python installed in typical directories, use ls:

$ ls /usr/bin/python* /usr/local/bin/python*

The which command shows the path to the Python executables that are in your system’s PATH. To list every matching interpreter rather than only the first one found, add the -a flag:

$ which -a python python2 python3
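
The same lookup can be approximated from Python. This is a rough sketch rather than a replacement for which: find_pythons and PYTHON_NAME are hypothetical names, and the pattern only recognises names such as python, python3, and python3.9.

```python
import os
import re
from pathlib import Path

# Matches python, python2, python3, python3.9, python3.12, ...
PYTHON_NAME = re.compile(r"python(\d+(\.\d+)?)?")


def find_pythons(path_value: str) -> list[str]:
    """Return executables on the given PATH string that are named like Python."""
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory or not Path(directory).is_dir():
            continue
        try:
            entries = sorted(Path(directory).iterdir())
        except OSError:
            continue  # unreadable directory: skip it
        for entry in entries:
            if (
                PYTHON_NAME.fullmatch(entry.name)
                and entry.is_file()
                and os.access(entry, os.X_OK)
            ):
                found.append(str(entry))
    return found


if __name__ == "__main__":
    for interpreter in find_pythons(os.environ.get("PATH", "")):
        print(interpreter)
```

Like which -a, the function reports matches in PATH order, so the first line printed is the interpreter the shell would pick for that name.
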

To install a specific version of Python, first search for available Python versions:

$ brew search python@

Then install the desired version, for example:

$ brew install python@3.9

If you installed Python using Homebrew, you can list all installed Python versions with the following command:

$ brew list --versions | grep python

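To create a virtual environment with the built-in venv module, run the following command, replacing 3.x with your specific Python version and the {{Python virtual environment name here}} placeholder with your desired environment name: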

$ python3.x -m venv {{Python virtual environment name here}}

To activate the virtual environment, run:

$ source {{Python virtual environment name here}}/bin/activate

When you’re done working in the virtual environment, deactivate it by running:

$ deactivate
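
Creating the environment can also be scripted with the standard-library venv module. The sketch below is only an illustration: create_environment and the demo_env directory name are hypothetical, and the environment uses whichever interpreter runs the script, so pick the Python version by choosing the interpreter.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir and return its pyvenv.cfg path."""
    venv.create(env_dir, with_pip=with_pip)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # with_pip=False keeps this demo fast; python -m venv installs pip by default.
        config = create_environment(Path(tmp) / "demo_env", with_pip=False)
        print(config.read_text())
```

Activation is still done in the shell with source, because it changes the environment of the shell session itself.
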

Check the Current Environment

The VIRTUAL_ENV variable holds the path of the active environment and is empty when none is active:

$ echo $VIRTUAL_ENV

You can also check which Python interpreter your shell session currently resolves by using which python or which python3. The path shows whether you are using the system Python or one from a virtual environment:

$ which python
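
The same check can be made from inside Python. A minimal sketch, assuming an environment created by venv or a recent virtualenv; in_virtualenv is a hypothetical helper name.

```python
import os
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside an environment, sys.prefix points at the environment while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
    print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV", "(not set)"))
```
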

Using virtualenv

If you don’t have virtualenv installed, install it using pip:

$ pip install virtualenv

Create a Virtual Environment with a Specific Python Version:

$ virtualenv -p python3.9 demo_virtualenv_python3.9

Activation and deactivation of the virtual environment are the same as with venv:

$ source ./demo_virtualenv_python3.9/bin/activate


pip

Upgrade pip

$ python -m pip install --upgrade pip

List Installed Packages

$ pip list

Or, to print the installed packages in requirements format (one name==version pair per line):

$ pip freeze

Save Requirements: You can save the list of installed packages to a file, which is useful for replicating the environment elsewhere. Run:

$ pip freeze > requirements.txt

This requirements.txt file can then be used to install the same packages in another environment using:

$ pip install -r requirements.txt
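
When comparing environments it can help to read the pinned versions back out of a requirements file. The following is a simplified sketch, not a full requirements parser: parse_pins is a hypothetical helper, it only understands name==version lines, and the sample versions are made up for illustration.

```python
def parse_pins(requirements_text: str) -> dict[str, str]:
    """Map package name to pinned version for every name==version line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0]  # drop trailing comments
        line = line.split(";", 1)[0].strip()  # drop environment markers
        if "==" not in line:
            continue  # skip blanks, options such as -r or -e, and unpinned names
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


if __name__ == "__main__":
    # Hypothetical file contents; the version numbers are illustrative only.
    sample = "openai==1.3.0\n# a comment\nrequests\nLangChain==0.0.350  # pinned\n"
    print(parse_pins(sample))
```
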

Check Specific Package Version: If you’re looking for a specific package, you can filter the output using grep on Unix-based systems, or the findstr command on Windows. For example:

$ pip list | grep openai
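
A version can also be looked up from Python with the standard-library importlib.metadata module, which avoids parsing pip output. A small sketch; installed_version is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("openai", "langchain", "llama-index"):
        print(name, installed_version(name) or "not installed")
```

The lookup applies to the environment of the interpreter running the script, so run it with the virtual environment activated.
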

Install IPython and Jupyter Notebook

$ pip install ipython jupyter

Run Jupyter Notebook

$ jupyter notebook


Alternatively, to install only the Jupyter Notebook package instead of the full jupyter bundle, run the following with the virtual environment activated:

$ pip install notebook
$ jupyter notebook

Install LLM-related Python modules

$ pip install openai langchain llama-index

Getting the Current Working Directory

import os

current_directory = os.getcwd()
print("Current Directory:", current_directory)
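
The working directory matters because relative paths, such as a data folder name, are resolved against it. Here is a small pathlib sketch of that resolution; resolve_under is a hypothetical helper name.

```python
from pathlib import Path


def resolve_under(base: Path, relative: str) -> Path:
    """Join a relative path onto base and return it as an absolute path."""
    return (base / relative).resolve()


if __name__ == "__main__":
    cwd = Path.cwd()  # pathlib equivalent of os.getcwd()
    print("Current Directory:", cwd)
    print("A relative 'data' path resolves to:", resolve_under(cwd, "data"))
```
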

LlamaIndex Hello World!

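In Python practice, application code does not live inside the venv folder, so I create a directory named llamaindex_demo inside the src folder under the project root and add a program file named starter.py to it. The plain-text files used for testing go in the data folder, and you can put as many .txt files there as you like. The directory tree looks roughly like this: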

./src/
├── langchain_demo
└── llamaindex_demo
    ├── data
    │   └── paul_graham_essay.txt
    ├── starter.py
    └── storage
        ├── default__vector_store.json
        ├── docstore.json
        ├── graph_store.json
        ├── image__vector_store.json
        └── index_store.json

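The short program below demonstrates how to use LlamaIndex to read every .txt document in the data folder and then, based on their contents, have an OpenAI LLM (LlamaIndex’s default, since the snippet does not choose a model) generate a response to the question What did the author do growing up?

Before running it, remember to set your OpenAI API key in an environment variable:

$ export OPENAI_API_KEY={{Place your OpenAI API Key here}}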
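
A script can fail early with a clear message when the variable is missing, instead of failing later inside an API call. A minimal sketch; require_api_key is a hypothetical helper and is not part of the OpenAI or LlamaIndex libraries.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str], name: str = "OPENAI_API_KEY") -> str:
    """Return the value stored under name, or raise if it is missing or empty."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the script.")
    return value


if __name__ == "__main__":
    try:
        require_api_key(os.environ)
        print("API key found.")  # the key itself is never printed
    except RuntimeError as error:
        print(error)
```
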

Then create a file named starter.py and put the following Python code in it.

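# Written against the legacy llama_index package layout; in llama-index 0.10
# and later, import these two names from llama_index.core instead.
from llama_index import VectorStoreIndex, SimpleDirectoryReader

# Load the files in ./data (resolved against the current working directory).
documents = SimpleDirectoryReader("data").load_data()
# Embed the documents and build an in-memory vector index.
index = VectorStoreIndex.from_documents(documents)

# Ask a question; relevant chunks are retrieved and passed to the LLM.
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)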

If it runs without errors, the CLI should display something like the block below.

The author wrote short stories and also worked on programming, specifically on an IBM 1401 computer in 9th grade.