# ESP32 I2S Audio Processing Project
## Project Overview
This ESP32-based project captures audio over I2S (Inter-IC Sound) from an ES7210 codec. It includes functionality for:
- Audio recording to SD card in WAV format
- Automatic Speech Recognition (ASR) integration with cloud APIs
- WiFi connectivity for cloud services
- USB Mass Storage (MSC) for file access
- Power management and sleep modes
- Time synchronization via NTP
The project is built with ESP-IDF (Espressif IoT Development Framework) and targets audio applications such as voice recorders, speech recognition systems, and audio processing units.
## Key Features
### Audio Recording
- I2S TDM (Time Division Multiplexing) interface with ES7210 codec
- Configurable sample rate (16 kHz), bit width (16-bit), and channel count (stereo)
- Direct Memory Access (DMA) for efficient audio data transfer
- WAV file format with proper headers
- Timestamp-based file naming for recordings
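As a rough illustration of the WAV header mentioned above, the following host-compilable C sketch fills a canonical 44-byte PCM header. The `wav_header_t` type and `wav_header_make` function are hypothetical names, not taken from the project's `format_wav.h`, and the struct assumes a little-endian target (true for ESP32-family chips and x86).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Canonical 44-byte RIFF/WAVE header for uncompressed PCM.
 * Hypothetical layout: the project's format_wav.h may differ.
 * Fields are stored in host byte order, so this assumes little-endian. */
typedef struct {
    char     riff_tag[4];      /* "RIFF" */
    uint32_t riff_size;        /* total file size minus 8 bytes */
    char     wave_tag[4];      /* "WAVE" */
    char     fmt_tag[4];       /* "fmt " */
    uint32_t fmt_size;         /* 16 for PCM */
    uint16_t audio_format;     /* 1 = PCM */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;        /* sample_rate * block_align */
    uint16_t block_align;      /* num_channels * bits_per_sample / 8 */
    uint16_t bits_per_sample;
    char     data_tag[4];      /* "data" */
    uint32_t data_size;        /* PCM payload size in bytes */
} wav_header_t;

/* All fields are naturally aligned, so no padding is expected. */
_Static_assert(sizeof(wav_header_t) == 44, "WAV header must be 44 bytes");

/* Builds a header for `data_bytes` bytes of interleaved PCM audio. */
wav_header_t wav_header_make(uint32_t sample_rate, uint16_t channels,
                             uint16_t bits, uint32_t data_bytes)
{
    assert(channels > 0 && bits > 0 && bits % 8 == 0);

    wav_header_t h;
    memcpy(h.riff_tag, "RIFF", 4);
    memcpy(h.wave_tag, "WAVE", 4);
    memcpy(h.fmt_tag,  "fmt ", 4);
    memcpy(h.data_tag, "data", 4);

    h.fmt_size        = 16;
    h.audio_format    = 1;
    h.num_channels    = channels;
    h.sample_rate     = sample_rate;
    h.bits_per_sample = bits;
    h.block_align     = (uint16_t)(channels * (bits / 8));
    h.byte_rate       = sample_rate * h.block_align;
    h.data_size       = data_bytes;
    h.riff_size       = 36 + data_bytes;  /* header bytes after riff_size + payload */
    return h;
}
```

For the settings in this document (16 kHz, 2 channels, 16-bit), a 20-second recording holds 1,280,000 bytes of PCM data, so `wav_header_make(16000, 2, 16, 1280000)` yields a byte rate of 64,000 and a block alignment of 4.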
### Connectivity
- WiFi station mode for internet connectivity
- SD card mounting via SDMMC interface
- NTP time synchronization with multiple servers
- HTTP client for cloud API communication
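Timestamp-based file naming depends on the NTP-synchronized clock and the mounted SD card listed above. The sketch below shows one way such a path could be formatted. The `YYYYMMDD_HHMMSS.wav` pattern and the `make_recording_path` helper are assumptions for illustration; the project's actual naming scheme is not documented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Builds "<mount_point>/<YYYYMMDD_HHMMSS>.wav" from a broken-down time.
 * Hypothetical helper: the naming pattern is an assumption.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if the output buffer is too small. */
int make_recording_path(char *out, size_t out_len,
                        const char *mount_point, const struct tm *t)
{
    assert(out != NULL && mount_point != NULL && t != NULL);

    int n = snprintf(out, out_len, "%s/%04d%02d%02d_%02d%02d%02d.wav",
                     mount_point,
                     t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
                     t->tm_hour, t->tm_min, t->tm_sec);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    return n;
}
```

With a `struct tm` for 2024-03-05 14:07:09 and the mount point `/sdcard`, this produces `/sdcard/20240305_140709.wav`. On target, the `struct tm` would typically come from `time()` and `localtime_r()` once the clock has been set. A name of this length may also need long filename support enabled in the FAT driver.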
### Cloud Integration
- ASR (Automatic Speech Recognition) using SiliconFlow API
- Audio transcription capabilities
- Secure HTTPS communication with bearer token authentication
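Bearer token authentication means each request carries an `Authorization: Bearer <token>` header. A minimal sketch of formatting that header value, taking the token as a parameter rather than a compile-time constant, might look like this. `format_bearer_header` is a hypothetical helper, not a function from `asr.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats the value of an HTTP Authorization header for bearer-token auth.
 * Hypothetical helper for illustration.
 * Returns 0 on success, -1 if the token is empty or the buffer is too small. */
int format_bearer_header(char *out, size_t out_len, const char *token)
{
    assert(out != NULL && token != NULL);

    if (token[0] == '\0') {
        return -1;  /* refuse to send an empty credential */
    }

    int n = snprintf(out, out_len, "Bearer %s", token);
    if (n < 0 || (size_t)n >= out_len) {
        return -1;  /* truncated output would produce an invalid token */
    }
    return 0;
}
```

On target, the resulting string could be passed to `esp_http_client_set_header(client, "Authorization", value)` before performing the request.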
### Power Management
- Dynamic CPU frequency scaling (8 MHz to 240 MHz)
- Light sleep mode for power conservation
- Automatic power management configuration
### USB Interface
- USB Mass Storage Class (MSC) implementation
- SD card access via USB when connected to host
- Console commands for file system operations
## Hardware Configuration
### I2S Interface
- Master clock (MCK): GPIO 38
- Bit clock (BCK): GPIO 14
- Word select (WS): GPIO 13
- Data in (DI): GPIO 12
### ES7210 Codec
- I2C interface: Port 0
- SDA: GPIO 1
- SCL: GPIO 2
- I2C address: 0x41
### SD Card Interface
- Command: GPIO 48
- Clock: GPIO 47
- Data 0: GPIO 21
- Mount point: `/sdcard`
### Additional Pins
- LED indicator: GPIO 48 (the same GPIO listed for the SD card command line above; verify against the board schematic)
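A pin table like the lists above is easy to get wrong, so a small host-side check for duplicate GPIO assignments can be useful. The sketch below copies the pin numbers from this section. The `pin_assignment_t` type and `count_pin_conflicts` function are hypothetical and not part of the project sources.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    int gpio;
} pin_assignment_t;

/* Pin numbers copied from the Hardware Configuration lists above. */
const pin_assignment_t PIN_MAP[] = {
    {"I2S MCK", 38}, {"I2S BCK", 14}, {"I2S WS", 13}, {"I2S DI", 12},
    {"I2C SDA", 1},  {"I2C SCL", 2},
    {"SD CMD", 48},  {"SD CLK", 47},  {"SD D0", 21},
    {"LED", 48},
};
const size_t PIN_MAP_LEN = sizeof(PIN_MAP) / sizeof(PIN_MAP[0]);

/* Counts pairs of entries that share a GPIO number.
 * If `first` / `second` are non-NULL, they receive the indices of the
 * first conflicting pair found (left untouched when there is none). */
size_t count_pin_conflicts(const pin_assignment_t *pins, size_t n,
                           size_t *first, size_t *second)
{
    assert(pins != NULL || n == 0);

    size_t conflicts = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (pins[i].gpio != pins[j].gpio) {
                continue;
            }
            if (conflicts == 0) {
                if (first != NULL)  *first = i;
                if (second != NULL) *second = j;
            }
            conflicts++;
        }
    }
    return conflicts;
}
```

Run against the table above, the check reports one conflict: the SD card command line and the LED indicator are both listed on GPIO 48.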
## Building and Running
### Prerequisites
- ESP-IDF v4.x or later installed and configured
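- ESP32 development board (the GPIO numbers up to 48 and the USB MSC feature suggest a chip with native USB, such as the ESP32-S3)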
- ES7210 I2S audio codec
- SD card
### Build Process
```bash
# Navigate to project directory
cd /path/to/esp32i2s
# Configure the project (optional)
idf.py menuconfig
# Build the project
idf.py build
# Flash to ESP32
idf.py flash
# Monitor serial output
idf.py monitor
```
### Configuration Options
- Sample rate: 16000 Hz
- Channel count: 2 (Stereo)
- Bit width: 16 bits
- Recording duration: 20 seconds per file
- DMA buffer count: 8
- DMA buffer length: 512 samples
## File Structure
```
esp32i2s/
├── CMakeLists.txt # Main build configuration
├── partitions.csv # Partition table
├── sdkconfig # SDK configuration
├── sdkconfig.defaults # Default SDK settings
├── main/ # Main application source
│ ├── main.c # Main application entry point
│ ├── record.c/h # Audio recording functionality
│ ├── asr.c/h # Automatic Speech Recognition
│ ├── base.h # Common definitions and includes
│ ├── format_wav.h # WAV file format definitions
│ ├── usb_msc.c # USB Mass Storage implementation
│ └── ...
└── ...
```
## Development Conventions
### Coding Style
- Follow ESP-IDF coding conventions
- Use the `ESP_LOGx` macros (`ESP_LOGE`, `ESP_LOGW`, `ESP_LOGI`, `ESP_LOGD`, `ESP_LOGV`) for logging
- Handle errors with `ESP_ERROR_CHECK` and `ESP_RETURN_ON_FALSE`
- Use FreeRTOS tasks for concurrent operations
### Memory Management
- Use DMA-capable memory allocation for I2S buffers
- Properly free allocated memory in error paths
- Monitor heap usage for memory leaks
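One common way to free memory reliably in error paths is a single cleanup label that every failure jumps to. The sketch below demonstrates the pattern on the host with `calloc`; on target the allocation could instead be `heap_caps_calloc(1, size, MALLOC_CAP_DMA)` for DMA-capable memory. The `write_silence` function and the allocator macros are hypothetical and only serve to show the control flow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in allocator so the sketch builds on a host machine.
 * Assumption: on ESP-IDF these could map to heap_caps_calloc / heap_caps_free. */
#define AUDIO_BUF_ALLOC(size) calloc(1, (size))
#define AUDIO_BUF_FREE(ptr)   free(ptr)

/* Writes `chunks` zero-filled blocks of `chunk_bytes` bytes to `path`.
 * Returns 0 on success, -1 on any failure. Every exit goes through
 * the cleanup label, so the buffer and file handle are always released. */
int write_silence(const char *path, size_t chunk_bytes, size_t chunks)
{
    assert(path != NULL && chunk_bytes > 0);

    int ret = -1;
    FILE *f = NULL;
    unsigned char *buf = AUDIO_BUF_ALLOC(chunk_bytes);
    if (buf == NULL) {
        goto cleanup;
    }

    f = fopen(path, "wb");
    if (f == NULL) {
        goto cleanup;  /* buf is still freed below */
    }

    for (size_t i = 0; i < chunks; i++) {
        if (fwrite(buf, 1, chunk_bytes, f) != chunk_bytes) {
            goto cleanup;
        }
    }
    ret = 0;

cleanup:
    if (f != NULL) {
        fclose(f);
    }
    AUDIO_BUF_FREE(buf);  /* free(NULL) is a no-op */
    return ret;
}
```

The same shape extends to more resources: initialize each handle to a null value, acquire in order, and release in the cleanup block regardless of where the failure occurred.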
### Power Efficiency
- Implement sleep modes when idle
- Use power management configuration appropriately
- Minimize active periods for battery operation
## Testing and Debugging
### Serial Monitor
Monitor the serial output for logging information:
- I2S initialization status
- SD card mounting results
- WiFi connection status
- Recording progress
- ASR API responses
### Console Commands
When USB MSC is active, the following console commands are available:
- `read` - Read README.MD file
- `write` - Create/update README.MD file
- `size` - Show storage capacity
- `expose` - Expose storage to USB host
- `status` - Show storage exposure status
- `exit` - Exit application
## Security Considerations
- WiFi credentials are hardcoded in `asr.c` (`WIFI_SSID` and `WIFI_PASS`)
- API token is hardcoded in `asr.c` (`BEARER_TOKEN`)
- HTTPS communication uses certificate validation
- Consider storing credentials and tokens in NVS (non-volatile storage) instead of hardcoding them in source
## Known Issues and Limitations
- WiFi credentials and API tokens are hardcoded in source
- Recording duration is fixed at 20 seconds
- Only supports PCM WAV format
- Requires internet connectivity for ASR functionality
- USB MSC and firmware file system access are mutually exclusive: while the SD card is exposed to a USB host, the application cannot read or write it