ESP32 I2S Audio Processing Project

Project Overview

This ESP32-based project captures audio over I2S (Inter-IC Sound) from an ES7210 codec. It includes functionality for:

Audio recording to SD card in WAV format
Automatic Speech Recognition (ASR) integration with cloud APIs
WiFi connectivity for cloud services
USB Mass Storage (MSC) for file access
Power management and sleep modes
Time synchronization via NTP

The project is built using the ESP-IDF (Espressif IoT Development Framework) and targets audio applications such as voice recorders, speech recognition systems, or audio processing units.

Key Features

Audio Recording

I2S TDM (Time Division Multiplexing) interface with ES7210 codec
Configurable sample rate (16 kHz), bit width (16-bit), and channel count (stereo)
Direct Memory Access (DMA) for efficient audio data transfer
WAV file format with proper headers
Timestamp-based file naming for recordings
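
As an illustration of timestamp-based file naming, here is a minimal sketch in plain C. The helper name make_recording_path and the REC_YYYYMMDD_HHMMSS.wav pattern are assumptions made for this example, not the naming scheme used in record.c. The caller is expected to pass a broken-down time, for example from localtime_r() after NTP has set the clock.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not taken from record.c): writes
 * "<mount_point>/REC_YYYYMMDD_HHMMSS.wav" into out.
 * Returns 0 on success, -1 if formatting fails or out is too small. */
int make_recording_path(char *out, size_t out_len,
                        const char *mount_point, const struct tm *when)
{
    char stamp[32];

    /* strftime returns 0 when the result does not fit in stamp. */
    if (strftime(stamp, sizeof(stamp), "%Y%m%d_%H%M%S", when) == 0) {
        return -1;
    }

    /* snprintf returns the length it wanted to write; a value >= out_len
     * means the path was truncated, which is treated as an error. */
    int n = snprintf(out, out_len, "%s/REC_%s.wav", mount_point, stamp);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Names of this length only work on a FAT volume when long file name support is enabled in the FATFS component configuration; with 8.3 names a shorter pattern would be needed.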

Connectivity

WiFi station mode for internet connectivity
SD card mounting via SDMMC interface
NTP time synchronization with multiple servers
HTTP client for cloud API communication

Cloud Integration

ASR (Automatic Speech Recognition) using SiliconFlow API
Audio transcription capabilities
Secure HTTPS communication with bearer token authentication

Power Management

Dynamic CPU frequency scaling (8 MHz to 240 MHz)
Light sleep mode for power conservation
Automatic power management configuration

USB Interface

USB Mass Storage Class (MSC) implementation
SD card access via USB when connected to host
Console commands for file system operations

Hardware Configuration

I2S Interface

Master clock (MCK): GPIO 38
Bit clock (BCK): GPIO 14
Word select (WS): GPIO 13
Data in (DI): GPIO 12

ES7210 Codec

I2C interface: Port 0
SDA: GPIO 1
SCL: GPIO 2
I2C address: 0x41

SD Card Interface

Command: GPIO 48
Clock: GPIO 47
Data 0: GPIO 21
Mount point: /sdcard

Additional Pins

LED indicator: GPIO 48

Building and Running

Prerequisites

ESP-IDF v4.x or later installed and configured
ESP32 development board
ES7210 I2S audio codec
SD card

Build Process

# Navigate to project directory
cd /path/to/esp32i2s

# Configure project (optional, if needed)
idf.py menuconfig

# Build the project
idf.py build

# Flash to ESP32
idf.py flash

# Monitor serial output
idf.py monitor

Configuration Options

Sample rate: 16000 Hz
Channel count: 2 (Stereo)
Bit width: 16 bits
Recording duration: 20 seconds per file
DMA buffer count: 8
DMA buffer length: 512 samples
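
These values fix both the size of each recording and the fields of its WAV header. The sketch below shows the arithmetic in plain C; it uses the canonical 44-byte PCM header layout and is independent of the project's format_wav.h, whose actual definitions may differ. The macro and function names are assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>

/* Values from the configuration list above. */
#define SAMPLE_RATE_HZ   16000u
#define CHANNELS         2u
#define BITS_PER_SAMPLE  16u
#define RECORD_SECONDS   20u

/* PCM payload of one recording: 16000 * 2 * 2 * 20 = 1,280,000 bytes. */
uint32_t pcm_data_bytes(void)
{
    return SAMPLE_RATE_HZ * CHANNELS * (BITS_PER_SAMPLE / 8u) * RECORD_SECONDS;
}

/* WAV fields are little-endian; write them byte by byte so the result
 * does not depend on host endianness or struct padding. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: fills a canonical 44-byte PCM WAV header. */
void build_wav_header(uint8_t hdr[44], uint32_t data_bytes)
{
    uint16_t block_align = (uint16_t)(CHANNELS * (BITS_PER_SAMPLE / 8u));
    uint32_t byte_rate = SAMPLE_RATE_HZ * block_align;

    memcpy(hdr, "RIFF", 4);
    put_le32(hdr + 4, 36u + data_bytes);      /* file size minus 8 bytes */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16u);                  /* fmt chunk size for PCM */
    put_le16(hdr + 20, 1u);                   /* audio format 1 = PCM */
    put_le16(hdr + 22, (uint16_t)CHANNELS);
    put_le32(hdr + 24, SAMPLE_RATE_HZ);
    put_le32(hdr + 28, byte_rate);            /* 64,000 bytes per second */
    put_le16(hdr + 32, block_align);          /* 4 bytes per stereo frame */
    put_le16(hdr + 34, (uint16_t)BITS_PER_SAMPLE);
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, data_bytes);
}
```

With the settings above, each 20-second file is 1,280,044 bytes: 1,280,000 bytes of PCM data plus the 44-byte header.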

File Structure

esp32i2s/
├── CMakeLists.txt          # Main build configuration
├── partitions.csv          # Partition table
├── sdkconfig              # SDK configuration
├── sdkconfig.defaults     # Default SDK settings
├── main/                  # Main application source
│   ├── main.c             # Main application entry point
│   ├── record.c/h         # Audio recording functionality
│   ├── asr.c/h            # Automatic Speech Recognition
│   ├── base.h             # Common definitions and includes
│   ├── format_wav.h       # WAV file format definitions
│   ├── usb_msc.c          # USB Mass Storage implementation
│   └── ...
└── ...

Development Conventions

Coding Style

Follow ESP-IDF coding conventions
Use ESP_LOG macros for logging
Handle errors with ESP_ERROR_CHECK and ESP_RETURN_ON_FALSE
Use FreeRTOS tasks for concurrent operations

Memory Management

Use DMA-capable memory allocation for I2S buffers
Properly free allocated memory in error paths
Monitor heap usage for memory leaks

Power Efficiency

Implement sleep modes when idle
Use power management configuration appropriately
Minimize active periods for battery operation

Testing and Debugging

Serial Monitor

Monitor the serial output for logging information:

I2S initialization status
SD card mounting results
WiFi connection status
Recording progress
ASR API responses

Console Commands

When USB MSC is active, the following console commands are available:

read - Read README.MD file
write - Create/update README.MD file
size - Show storage capacity
expose - Expose storage to USB host
status - Show storage exposure status
exit - Exit application

Security Considerations

WiFi credentials are hardcoded in asr.c (WIFI_SSID and WIFI_PASS)
API token is hardcoded in asr.c (BEARER_TOKEN)
HTTPS communication uses certificate validation
Consider using NVS for storing sensitive information instead of hardcoding

Known Issues and Limitations

WiFi credentials and API tokens are hardcoded in source
Recording duration is fixed at 20 seconds
Only supports PCM WAV format
Requires internet connectivity for ASR functionality
The SD card cannot be exposed over USB MSC and accessed through the local file system at the same time
