# ESP32 I2S Audio Processing Project

## Project Overview

This is an ESP32-based audio processing project that implements I2S (Inter-IC Sound) communication for capturing audio data using an ES7210 codec. The project includes functionality for:

- Audio recording to SD card in WAV format
- Automatic Speech Recognition (ASR) integration with cloud APIs
- WiFi connectivity for cloud services
- USB Mass Storage (MSC) for file access
- Power management and sleep modes
- Time synchronization via NTP

The project is built using the ESP-IDF (Espressif IoT Development Framework) and targets audio applications such as voice recorders, speech recognition systems, or audio processing units.

## Key Features

### Audio Recording

- I2S TDM (Time Division Multiplexing) interface with ES7210 codec
- Configurable sample rate (16 kHz), bit width (16-bit), and stereo channels
- Direct Memory Access (DMA) for efficient audio data transfer
- WAV file format with proper headers
- Timestamp-based file naming for recordings

### Connectivity

- WiFi station mode for internet connectivity
- SD card mounting via SDMMC interface
- NTP time synchronization with multiple servers
- HTTP client for cloud API communication

### Cloud Integration

- ASR (Automatic Speech Recognition) using SiliconFlow API
- Audio transcription capabilities
- Secure HTTPS communication with bearer token authentication

### Power Management

- Dynamic CPU frequency scaling (8 MHz to 240 MHz)
- Light sleep mode for power conservation
- Automatic power management configuration

### USB Interface

- USB Mass Storage Class (MSC) implementation
- SD card access via USB when connected to host
- Console commands for file system operations

## Hardware Configuration

### I2S Interface

- Master clock (MCK): GPIO 38
- Bit clock (BCK): GPIO 14
- Word select (WS): GPIO 13
- Data in (DI): GPIO 12

### ES7210 Codec

- I2C interface: Port 0
- SDA: GPIO 1
- SCL: GPIO 2
- I2C address: 0x41

### SD Card Interface

- Command: GPIO 48
- Clock: GPIO 47
- Data 0: GPIO 21
- Mount point: `/sdcard`

### Additional Pins

- LED indicator: GPIO 48

## Building and Running

### Prerequisites

- ESP-IDF v4.x or later installed and configured
- ESP32 development board
- ES7210 I2S audio codec
- SD card

### Build Process

```bash
# Navigate to project directory
cd /path/to/esp32i2s

# Configure project (optional, if needed)
idf.py menuconfig

# Build the project
idf.py build

# Flash to ESP32
idf.py flash

# Monitor serial output
idf.py monitor
```

### Configuration Options

- Sample rate: 16000 Hz
- Channel count: 2 (Stereo)
- Bit width: 16 bits
- Recording duration: 20 seconds per file
- DMA buffer count: 8
- DMA buffer length: 512 samples

## File Structure

```
esp32i2s/
├── CMakeLists.txt          # Main build configuration
├── partitions.csv          # Partition table
├── sdkconfig               # SDK configuration
├── sdkconfig.defaults      # Default SDK settings
├── main/                   # Main application source
│   ├── main.c              # Main application entry point
│   ├── record.c/h          # Audio recording functionality
│   ├── asr.c/h             # Automatic Speech Recognition
│   ├── base.h              # Common definitions and includes
│   ├── format_wav.h        # WAV file format definitions
│   ├── usb_msc.c           # USB Mass Storage implementation
│   └── ...
└── ...
```

## Development Conventions

### Coding Style

- Follow ESP-IDF coding conventions
- Use `ESP_LOG` macros for logging
- Handle errors with `ESP_ERROR_CHECK` and `ESP_RETURN_ON_FALSE`
- Use FreeRTOS tasks for concurrent operations

### Memory Management

- Use DMA-capable memory allocation for I2S buffers
- Properly free allocated memory in error paths
- Monitor heap usage for memory leaks

### Power Efficiency

- Implement sleep modes when idle
- Use power management configuration appropriately
- Minimize active periods for battery operation

## Testing and Debugging

### Serial Monitor

Monitor the serial output for logging information:

- I2S initialization status
- SD card mounting results
- WiFi connection status
- Recording progress
- ASR API responses

### Console Commands

When USB MSC is active, the following console commands are available:

- `read` - Read README.MD file
- `write` - Create/update README.MD file
- `size` - Show storage capacity
- `expose` - Expose storage to USB host
- `status` - Show storage exposure status
- `exit` - Exit application

## Security Considerations

- WiFi credentials are hardcoded in `asr.c` (`WIFI_SSID` and `WIFI_PASS`)
- API token is hardcoded in `asr.c` (`BEARER_TOKEN`)
- HTTPS communication uses certificate validation
- Consider using NVS for storing sensitive information instead of hardcoding

## Known Issues and Limitations

- WiFi credentials and API tokens are hardcoded in source
- Recording duration is fixed at 20 seconds
- Only supports PCM WAV format
- Requires internet connectivity for ASR functionality
- USB MSC and file system access cannot be used simultaneously
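
Because the recording duration and PCM format are fixed, the size of each recording is predictable. The following is a minimal sketch of how a PCM WAV header could be filled in for the documented settings. It is not the project's `format_wav.h`: the names `wav_header_t`, `wav_data_bytes`, and `wav_header_init` are hypothetical, and the layout assumes the canonical 44-byte PCM header on a little-endian CPU (which holds for the ESP32).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Canonical 44-byte PCM WAV header. Hypothetical layout for illustration;
 * the project's own format_wav.h may differ. All fields are naturally
 * aligned, so no packing attribute is needed. */
typedef struct {
    char     riff_tag[4];     /* "RIFF" */
    uint32_t riff_size;       /* total file size minus 8 bytes */
    char     wave_tag[4];     /* "WAVE" */
    char     fmt_tag[4];      /* "fmt " */
    uint32_t fmt_size;        /* 16 for plain PCM */
    uint16_t audio_format;    /* 1 = PCM */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;       /* sample_rate * block_align */
    uint16_t block_align;     /* num_channels * bits_per_sample / 8 */
    uint16_t bits_per_sample;
    char     data_tag[4];     /* "data" */
    uint32_t data_size;       /* PCM payload size in bytes */
} wav_header_t;

_Static_assert(sizeof(wav_header_t) == 44, "unexpected WAV header size");

/* PCM payload size in bytes for a recording of the given length. */
uint32_t wav_data_bytes(uint32_t sample_rate, uint16_t channels,
                        uint16_t bits_per_sample, uint32_t seconds)
{
    assert(bits_per_sample % 8 == 0);
    return sample_rate * channels * (bits_per_sample / 8u) * seconds;
}

/* Fill in every header field for a PCM stream of data_size bytes. */
void wav_header_init(wav_header_t *h, uint32_t sample_rate, uint16_t channels,
                     uint16_t bits_per_sample, uint32_t data_size)
{
    assert(h != NULL && bits_per_sample % 8 == 0);
    memcpy(h->riff_tag, "RIFF", 4);
    memcpy(h->wave_tag, "WAVE", 4);
    memcpy(h->fmt_tag,  "fmt ", 4);
    memcpy(h->data_tag, "data", 4);
    h->fmt_size        = 16;
    h->audio_format    = 1;
    h->num_channels    = channels;
    h->sample_rate     = sample_rate;
    h->bits_per_sample = bits_per_sample;
    h->block_align     = (uint16_t)(channels * (bits_per_sample / 8u));
    h->byte_rate       = sample_rate * h->block_align;
    h->data_size       = data_size;
    h->riff_size       = 36u + data_size; /* header bytes after riff_size + payload */
}
```

With the documented settings (16000 Hz, 2 channels, 16 bits, 20 seconds), `wav_data_bytes(16000, 2, 16, 20)` gives 1,280,000 bytes of audio, so each file would be 1,280,044 bytes if the canonical 44-byte header is used.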