System Overview

VISION has three parts that work together: the smart glasses, the mobile app, and the cloud sync that connects them.

┌────────────────────┐        Cloud Sync        ┌────────────────────┐
│                    │ ◄──────────────────────► │                    │
│   SMART GLASSES    │                          │     MOBILE APP     │
│   (Worn by VIP)    │                          │     (Android)      │
│                    │                          │                    │
│  • Voice-controlled│                          │  • VIP interface   │
│  • AI vision       │                          │  • Caretaker view  │
│  • Navigation      │                          │  • Chat & sharing  │
│  • Haptic feedback │                          │  • Live monitoring │
│                    │                          │                    │
└────────────────────┘                          └────────────────────┘

The Smart Glasses

The glasses are the VIP’s eyes and ears. They combine cameras, sensors, and on-device AI to help the user understand and navigate their surroundings — all controlled by voice.

Hardware features

  • Dual cameras — one for general vision (object detection, navigation, document scanning), one dedicated to face recognition
  • Ultrasonic distance sensing — for obstacle detection with distance awareness
  • GPS — for navigation and location tracking
  • Head position sensor — gentle reminders to keep your head level
  • Physical button — push-to-talk plus tap controls (skip, cancel)
  • Vibration motors — haptic feedback for alerts
  • Lights — with auto-brightness adjustment
  • Bluetooth + WiFi — for pairing and cloud connectivity
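
As background on the ultrasonic distance sensing listed above: sensors of this kind generally time how long a sound pulse takes to bounce back, then convert that time to a distance. The sketch below shows only that general conversion. The function name and the fixed speed of sound (about 343 m/s in air at 20 °C) are illustrative assumptions, not details of the VISION firmware.

```python
# Approximate speed of sound in dry air at 20 °C, in metres per second.
SPEED_OF_SOUND_M_S = 343.0


def echo_to_distance_m(echo_time_s: float) -> float:
    """Convert a round-trip echo time (seconds) to a one-way distance (metres).

    The pulse travels to the obstacle and back, so the path is halved.
    """
    if echo_time_s < 0:
        raise ValueError("echo time cannot be negative")
    return SPEED_OF_SOUND_M_S * echo_time_s / 2


if __name__ == "__main__":
    # A 10 ms round trip corresponds to an obstacle roughly 1.7 m away.
    print(f"{echo_to_distance_m(0.010):.3f} m")
```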

What runs on the glasses

  • Voice command processing and voice response
  • Real-time face recognition (multiple faces at once)
  • Obstacle detection with priority-based warnings
  • Object recognition for scene description
  • GPS-based navigation and location logging
  • Document scanning with AI-powered text extraction and summarization
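
To make "priority-based warnings" more concrete, here is a minimal sketch of one way detected obstacles could be ranked by distance so that the most pressing one is announced first. The thresholds, level names, and the `Obstacle` type are hypothetical; the actual rules used on the glasses are not documented here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds in metres; the real values are not documented.
URGENT_WITHIN_M = 1.0
CAUTION_WITHIN_M = 2.5


@dataclass
class Obstacle:
    label: str
    distance_m: float


def warning_level(obstacle: Obstacle) -> Optional[str]:
    """Map a distance to a warning level, or None if it is too far to mention."""
    if obstacle.distance_m <= URGENT_WITHIN_M:
        return "urgent"
    if obstacle.distance_m <= CAUTION_WITHIN_M:
        return "caution"
    return None


def next_warning(obstacles: list[Obstacle]) -> Optional[tuple[Obstacle, str]]:
    """Pick the nearest obstacle that deserves a warning, with its level."""
    for obstacle in sorted(obstacles, key=lambda o: o.distance_m):
        level = warning_level(obstacle)
        if level is not None:
            return obstacle, level
        break  # the nearest obstacle is already too far, so the rest are too
    return None
```

In a sketch like this, an "urgent" result might trigger vibration plus speech while "caution" triggers speech only; that mapping is likewise an assumption.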

The Mobile App

A companion Android app for VIPs and their caretakers. Every action on the glasses is reflected here, and vice versa.

  • VIP mode — manage contacts (faces), review history, customize your glasses, chat with caretakers
  • Caretaker mode — monitor the VIP’s live navigation, review detected faces and locations, chat and share content

Cloud Connectivity

The glasses and mobile app don’t connect directly over Bluetooth for day-to-day use. Instead, they sync through the cloud in real time.

This means:

  • Your caretaker can check in from anywhere in the world
  • Adding a new face in the app makes it recognizable on the glasses as soon as the change syncs
  • If the glasses briefly lose connection, data syncs back when they reconnect
  • All data is encrypted in transit and at rest
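
The reconnect behaviour described above is commonly built as a local queue that holds events while offline and uploads them in order once the connection returns. The following is a minimal sketch of that pattern only. The `SyncQueue` class and the `send` callback are hypothetical stand-ins, not the VISION sync API, and encryption is left out.

```python
from collections import deque
from typing import Callable


class SyncQueue:
    """Hold events locally and upload them oldest-first when possible."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        # `send` uploads one event and is assumed to raise ConnectionError
        # when the device is offline (a hypothetical contract).
        self._send = send
        self._pending: deque = deque()

    @property
    def pending(self) -> int:
        return len(self._pending)

    def record(self, event: dict) -> None:
        """Queue an event, then try to upload everything that is waiting."""
        self._pending.append(event)
        self.flush()

    def flush(self) -> int:
        """Upload queued events in order; stop at the first failure.

        Returns the number of events uploaded during this call.
        """
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # still offline; keep the event for the next attempt
            self._pending.popleft()
            sent += 1
        return sent
```

An event is removed from the queue only after its upload succeeds, so a dropped connection in the middle of a flush does not lose data.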

Languages

All three system components — glasses, app, and cloud voices — support English, Chinese (Mandarin), and Malay.

Response Time

From button press to voice response, VISION aims for a total response time of under 10 seconds for most commands — often much faster for simple requests.

What Happens When You Speak a Command

  1. You press and hold the button — the glasses confirm with a small vibration
  2. You speak your command naturally
  3. You release — the glasses confirm with a second vibration
  4. The glasses understand what you said and pick the best-matching command
  5. The right module runs (object detection, navigation, messaging, etc.)
  6. The glasses speak the result — you can tap to skip or double-tap to stop
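
Step 4 above, picking the best-matching command, can be sketched as a similarity search over known example phrases. This is an illustration under stated assumptions: the command names, example phrases, and the 0.6 cut-off are invented for the example, and simple string similarity from Python's standard `difflib` stands in for whatever language understanding the glasses actually use.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical commands and example phrases, for illustration only.
COMMANDS = {
    "describe_scene": ["what is in front of me", "describe the scene"],
    "navigate": ["take me home", "navigate to"],
    "read_document": ["read this document"],
    "send_message": ["send a message"],
}

# Hypothetical cut-off: below this similarity, no command is chosen.
MIN_SCORE = 0.6


def match_command(transcript: str) -> Optional[str]:
    """Return the command whose example phrase is most similar to the
    transcript, or None if nothing is similar enough."""
    text = transcript.lower().strip()
    best_name: Optional[str] = None
    best_score = 0.0
    for name, phrases in COMMANDS.items():
        for phrase in phrases:
            score = SequenceMatcher(None, text, phrase).ratio()
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= MIN_SCORE else None


if __name__ == "__main__":
    print(match_command("Take me home please"))
```

Returning `None` for a weak match gives the caller a chance to ask the user to repeat the command instead of running the wrong module.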

See User Flow for detailed walkthroughs of each interaction.