Badger Vision

Rahul Hathwar - 2025-01-01 - Prize Winner at Hack Midwest 2024 - AI-Powered Facial Recognition for the Visually Impaired

Project Overview

A prize winner at Hack Midwest 2024, Badger Vision is a web application that uses Zoom's real-time video transmission to help identify faces for the visually impaired. By combining computer vision, machine learning, and accessible interface design, it provides real-time audio descriptions of the people in the user's field of view.

Tech Stack:

  • React + TypeScript (frontend)
  • Python (computer vision backend)
  • Zoom Video SDK
  • Pinata (IPFS storage)
  • Face recognition ML models
  • Text-to-speech synthesis

Achievement:

  • šŸ† Prize Winner at Hack Midwest 2024
  • Built in 36 hours
  • Working prototype with live video demo
  • Addresses critical accessibility need

The Problem

Social Isolation for the Visually Impaired

Visually impaired individuals face significant challenges in social settings:

  • Cannot identify who is speaking or entering a room
  • Miss nonverbal cues like facial expressions
  • Difficulty navigating crowded spaces
  • Social anxiety from not recognizing familiar faces
  • Professional challenges in meetings and networking

Our Solution: Real-time facial recognition with audio feedback, allowing users to "see" who is around them through sound.


Technical Architecture

System Flow

┌─────────────┐      ┌──────────────┐      ┌────────────────┐
│  Zoom Video │ ───> │  React App   │ ───> │ Python Backend │
│     SDK     │      │  (Frontend)  │      │  (CV/ML)       │
└─────────────┘      └──────────────┘      └────────────────┘
                            │                       │
                            │                       │
                            v                       v
                    ┌──────────────┐        ┌───────────────┐
                    │ Text-to-     │        │  Face         │
                    │ Speech       │        │  Recognition  │
                    └──────────────┘        │  Models       │
                                            └───────────────┘
                                                    │
                                                    v
                                            ┌───────────────┐
                                            │  Pinata/IPFS  │
                                            │  (Storage)    │
                                            └───────────────┘
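
To make the flow concrete, here is a minimal Python sketch of one pass through the loop above: a captured frame is sent to the backend for analysis, and the names that come back are spoken aloud. The endpoint URL is a placeholder and pyttsx3 stands in for the text-to-speech component; the actual hackathon code may differ.

```python
# One simplified pass through the system flow: frame -> backend -> speech.
import base64

import pyttsx3
import requests

BACKEND_URL = "http://localhost:5000/analyze-frame"  # placeholder endpoint


def describe_frame(jpeg_bytes: bytes) -> None:
    # Send the captured frame to the Python CV backend as base64 JSON.
    payload = {"frame": base64.b64encode(jpeg_bytes).decode("ascii")}
    names = requests.post(BACKEND_URL, json=payload).json()["faces"]

    # Turn the recognition result into spoken audio feedback.
    message = ("I can see " + ", ".join(names)) if names else "No faces detected"
    engine = pyttsx3.init()
    engine.say(message)
    engine.runAndWait()
```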

Component Breakdown

Frontend (React + TypeScript):

  • User interface for setup and configuration
  • Zoom Video SDK integration
  • Frame capture and transmission
  • Audio feedback controls
  • Face database management UI

Backend (Python):

  • Face detection (face_recognition library)
  • Face encoding and matching (see the sketch after this list)
  • API endpoints for frame analysis
  • Database queries
  • Performance optimization
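
As a rough illustration of the detection, encoding, and matching steps above, here is a minimal sketch using the face_recognition library named in the stack. The file names are placeholders, and the real backend is organized differently:

```python
# Minimal face detection/encoding/matching with the face_recognition library.
import face_recognition

# Encode a known face once; the result is a 128-dimensional vector.
known_image = face_recognition.load_image_file("alice.jpg")  # placeholder file
known_encoding = face_recognition.face_encodings(known_image)[0]

# Detect and encode every face in an incoming frame.
frame = face_recognition.load_image_file("frame.jpg")  # placeholder file
frame_encodings = face_recognition.face_encodings(frame)

# Compare each detected face against the known encoding; smaller distance
# means a closer match (the library's default tolerance is 0.6).
for encoding in frame_encodings:
    is_match = face_recognition.compare_faces([known_encoding], encoding)[0]
    distance = face_recognition.face_distance([known_encoding], encoding)[0]
    print(f"match={is_match}, distance={distance:.2f}")
```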

Storage (Pinata/IPFS):

  • Face photo storage (sketched below)
  • Face encoding metadata
  • User preferences
  • Usage analytics
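
For the storage layer, here is a hedged sketch of pinning a face photo through Pinata's pinFileToIPFS endpoint; the returned content hash (CID) serves as a stable reference to the photo. The JWT and file path are placeholders:

```python
# Pin a face photo to IPFS via Pinata's HTTP pinning API.
import requests

PINATA_JWT = "YOUR_PINATA_JWT"  # placeholder credential


def pin_face_photo(path: str) -> str:
    """Upload a photo to Pinata and return its IPFS content hash (CID)."""
    with open(path, "rb") as f:
        response = requests.post(
            "https://api.pinata.cloud/pinning/pinFileToIPFS",
            headers={"Authorization": f"Bearer {PINATA_JWT}"},
            files={"file": f},
        )
    response.raise_for_status()
    return response.json()["IpfsHash"]


# Example usage: cid = pin_face_photo("alice.jpg")
```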

Hackathon Development

36-Hour Timeline

Hour 0-4: Research & Planning

  • Identified accessibility gap
  • Researched face recognition libraries
  • Explored Zoom SDK capabilities
  • Defined MVP feature set
  • Set up development environment

Hour 4-16: Core Implementation

  • Implemented face recognition system
  • Integrated Zoom Video SDK
  • Built React UI components
  • Created API endpoints (see the sketch after this list)
  • Tested basic face detection
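
The frame-analysis endpoint from this phase might have looked roughly like the following Flask sketch: a route that accepts a base64-encoded frame, runs it through face_recognition, and returns the matched names. The route name, JSON fields, and in-memory face database are illustrative assumptions:

```python
# An illustrative frame-analysis endpoint (not the actual hackathon code).
import base64
import io

import face_recognition
import numpy as np
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)

# Known faces loaded at startup: parallel lists of names and 128-dim encodings.
KNOWN_NAMES: list[str] = []
KNOWN_ENCODINGS: list[np.ndarray] = []


@app.route("/analyze-frame", methods=["POST"])
def analyze_frame():
    # Expect a JSON body like {"frame": "<base64-encoded JPEG>"}.
    image_bytes = base64.b64decode(request.get_json()["frame"])
    frame = np.array(Image.open(io.BytesIO(image_bytes)).convert("RGB"))

    # Detect every face in the frame, then compute its encoding.
    locations = face_recognition.face_locations(frame)
    encodings = face_recognition.face_encodings(frame, locations)

    # Match each detected face against the known database.
    names = []
    for encoding in encodings:
        matches = face_recognition.compare_faces(KNOWN_ENCODINGS, encoding)
        names.append(KNOWN_NAMES[matches.index(True)] if True in matches else "unknown")

    return jsonify({"faces": names})
```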

Hour 16-28: Integration & Features

  • Connected frontend to backend
  • Implemented audio feedback
  • Added Pinata/IPFS storage
  • Optimized performance
  • Tested across browsers

Hour 28-36: Polish & Demo

  • UI refinements
  • Edge case handling
  • Demo preparation
  • Presentation creation
  • Live testing with judges

Future Enhancements

Technical Improvements:

  1. Mobile app: Native iOS/Android versions
  2. Wearable integration: Smart glasses with camera
  3. Emotion detection: Identify facial expressions
  4. Gesture recognition: Detect waving, pointing
  5. Object detection: Identify obstacles, not just people
  6. Multi-language: Support for international users

Feature Additions:

  1. Social context: Remember where you met people
  2. Conversation history: Recall previous interactions
  3. Group dynamics: Understand social groupings
  4. Accessibility modes: Different levels of detail
  5. Offline mode: Pre-loaded faces for common locations

Conclusion

Badger Vision demonstrates that emerging technologies can be harnessed to solve fundamental accessibility challenges. By combining computer vision, real-time video streaming, and thoughtful interface design, we created a tool that could transform daily life for visually impaired individuals.

This project showcases:

  • Cross-disciplinary skills: ML, web development, accessibility design
  • Rapid prototyping: 36 hours to a working product
  • Social awareness: Engineering for inclusivity
  • Technical integration: Multiple complex systems working together
  • Competitive success: Prize at major hackathon

More importantly, it reinforces that technology should serve human needs. The best innovations don't just showcase technical prowess; they make life meaningfully better for real people. Badger Vision aspires to do both.

Copyright © Rahul Hathwar. All Rights Reserved.