# QA Report for Tag QA_July2025_A

- **Tag/Commit:** QA_July2025_A (`f32abc1`)
- **Date:** 2025-08-01
- **QA Outcomes:** All tests passed except for file upload (see below)...
- **Environment:** Windows 10, Python 3.12, requirements.txt as of tag
- **Notes:** See log for detailed results.

## Issue Found
- **Step:** 
- **Bug:** 
- **Log excerpt:**
- **Screenshot:**

For this QA run, we begin from a pristine state:

1) Stop all services.
2) Delete logs.db and jobs.db.
3) Delete all files in the doc_store folder.
4) Start all services.
5) Test: the index page should show no data, and a query should return a "no data found" error.
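
The delete steps above could be scripted roughly as follows. This is a minimal sketch, not the project's actual tooling: it assumes logs.db, jobs.db, and the doc_store folder all sit under one base directory, and the function name `reset_state` is hypothetical. Services still need to be stopped before it runs and started afterwards.

```python
from pathlib import Path


def reset_state(base_dir: Path) -> list[Path]:
    """Delete logs.db, jobs.db, and every file in doc_store; return what was removed.

    Assumption: both databases and doc_store live directly under base_dir.
    """
    removed = []
    for name in ("logs.db", "jobs.db"):
        db_file = base_dir / name
        if db_file.exists():
            db_file.unlink()
            removed.append(db_file)

    doc_store = base_dir / "doc_store"
    if doc_store.is_dir():
        # Remove files only; the folder itself is kept so services can write to it.
        for item in sorted(doc_store.iterdir()):
            if item.is_file():
                item.unlink()
                removed.append(item)
    return removed
```

Calling `reset_state(Path("."))` from the project root would then cover steps 2 and 3, provided the layout assumption holds.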

Next, upload new1.txt, then new2.txt.

Issue the query: "can you summarize all of the todo items you can find, seperating each one on a single line, with due date first, then title, and description with any details you can augment"

Expected Answer:

- August 1, 2025: Respond to Amira and discuss weekend plans
- August 3, 2025: Go to Somesh class and finish taking notes for the week

## Issue Found

2025-08-03 | Issue-Critical: One login page is down

- **Step:** After changing the .env and cPanel secret values to non-defaults, and introducing code that attempts to safeguard by terminating the app, the app was hosed and One Login no longer worked.

- **Bug:** 
- **Log excerpt:**
- **Screenshot:**

Solution:

Using code base labels:

identity-backend

- QA_July2025_A_Checkpoint1 - read more about Git labels

echo-private

- QA_July2025_A
- QA_July2025_A_Checkpoint1

1. Issue with toggle OpenAI API/Locally hosted Ollama:

## Issue Found
- **Step:** Upload > Query with model=Ollama
- **Bug:** Ollama misses action item “Amira”
- **Log excerpt:** (paste relevant logs)
- **Screenshot:** (if available)

Details: Link

2. Issue with running worker.py and jobs table missing

- **Step:** Upload > Query with model=Ollama
- **Bug:** Ollama misses action item “Amira”
- **Log excerpt:** (paste relevant logs)
- **Screenshot:** (if available)

https://saadazizai.blogspot.com/2025/08/issue-running-python-workerpy-from.html

This approach is brittle; a better solution is needed (TBD).

## Bug Fix: Unable to start service after data purge
- **Action:** No fix.
- **Outcome:** Use the workaround and start the services in this order:
  1) logging_service.py
  2) parser_service.py
  3) api_gateway.py
  4) worker.py
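
The start order above could be captured in a small launcher. This is a sketch under assumptions: the scripts are started from the current directory with the same Python interpreter, and a fixed pause between launches stands in for a real readiness check. The pause length and the name `start_in_order` are hypothetical.

```python
import subprocess
import sys
import time

# Order taken from the workaround above.
SERVICES = ["logging_service.py", "parser_service.py", "api_gateway.py", "worker.py"]


def start_in_order(scripts=SERVICES, launch=subprocess.Popen, delay_seconds=2.0):
    """Start each script in sequence, pausing between launches.

    `launch` is injectable so the ordering can be checked without spawning processes.
    The fixed delay is an assumption; it does not verify that a service is ready.
    """
    processes = []
    for script in scripts:
        processes.append(launch([sys.executable, script]))
        time.sleep(delay_seconds)
    return processes
```

A fixed delay is still brittle. Polling each service until it responds would be sturdier, but that depends on endpoints not described here.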

## cPanel

A full multi-microservice app like Echo (with a different port for each service) will not run natively on most shared cPanel setups.

Read more: https://saadazizai.blogspot.com/2025/08/cpanel-architectural-limitations-single.html

3. Code issues: Log consistency, and verbosity

## Issue Found
- **Step:** 
- **Bug:** 
- **Log excerpt:**
- **Screenshot:** 
1) First, change the query endpoint so that it propagates our changes and routes to the backend properly (similar to query-ui).
2) Second, abstract away redundant code, for example query and query-ui. Possibly related: why is code repeated here, and why is the order of operations necessary to keep the worker script from crashing with a job SQL error?
3) Last, make sure everything is logged internally through our global logging service, for consistency and DRY.
4. Prompt issues - OpenAI wants specificity (todo: date, day, time)
## Issue Found
- **Step:** 
- **Bug:** 
- **Log excerpt:**
- **Query:** list each todo item across all documents, make sure to include date, day, time details
- **Screenshot:** 

Notice, the query here becomes: "list each todo item across all documents"

Read more: Link



5. Prompt issues - OpenAI is unable to remove todo items whose due dates have passed

6. August 3rd, 12:51 AM (Sunday) - End of day notes
identity-backend - core service
High-level review of your identity-backend MVP security as implemented: what’s fine for MVP, and what’s risky or needs improvement for production.


We’ll focus on:
- OAuth2/OIDC flow basics
- Token security
- Credentials and secrets management
- Surface area for leaks/bypass
- Logging & auditability
- Common attack surface

1. What you’ve implemented — High Level (MVP):
A. OAuth2 Authorization Code Flow (for browser-based login)

- /authorize: accepts client_id, redirect_uri, etc. and renders the login page
- /login: verifies the user (static username/password), issues a one-time code, redirects with ?code=...
- /token: exchanges the code for a JWT id_token (acting as access_token), with a single-use code and a client credentials check
B. JWT Service-to-Service & User Token
Tokens are short-lived (exp), signed, and audience/issuer-checked
Client secrets are enforced (basic security for /token endpoint)
JWT secret/issuer are loaded from config/env
C. Database
Auth codes stored in SQLite (authcodes.db), single-use, deleted after redemption
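
A minimal sketch of single-use redemption with sqlite3 is shown here. The table name, columns, TTL, and function names are assumptions for illustration, not the schema actually used in authcodes.db.

```python
import sqlite3
import time


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per outstanding auth code.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS auth_codes ("
        "code TEXT PRIMARY KEY, client_id TEXT NOT NULL, expires_at REAL NOT NULL)"
    )
    conn.commit()


def issue_code(conn, code: str, client_id: str, ttl_seconds: float = 60) -> None:
    conn.execute(
        "INSERT INTO auth_codes (code, client_id, expires_at) VALUES (?, ?, ?)",
        (code, client_id, time.time() + ttl_seconds),
    )
    conn.commit()


def redeem_code(conn, code: str, client_id: str) -> bool:
    """Return True at most once per valid code; the row is deleted on redemption."""
    cursor = conn.execute(
        "DELETE FROM auth_codes WHERE code = ? AND client_id = ? AND expires_at > ?",
        (code, client_id, time.time()),
    )
    conn.commit()
    # A single DELETE both checks and consumes the code, so a replay finds no row.
    return cursor.rowcount == 1
```

Doing the check and the delete in one statement avoids a window where two requests could both read the same code before either removes it.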
D. Logging
Custom unified logging — both to stderr (cPanel) and external service via JWT-secured POST
E. Config & Secrets
Uses .env and config.py for secrets and environment variables (with dev fallbacks)
F. CORS, XSS, CSRF
No explicit handling shown (common for MVP, but more later...)

2. MVP Security: What’s Acceptable
For an MVP, you meet many basic best practices:
- One-time codes for auth (prevents replay attacks).
- JWTs are short-lived, signed, with standard claims (iss, aud, exp, iat, etc.).
- Secrets configurable by environment.
- Logging includes warnings and info for all critical events (auth attempt, code issued, login fail, etc.).
- Hardcoded credentials (for quick testing only!).
- Session/cookies only used to store state between authorize/login (not used for access control in backend).

3. What’s NOT Secure for Production (Needs Attention Soon)
A. Hardcoded credentials
USERNAME = "username"; PASSWORD = "password"

Why: Obvious, but even for MVP, this is hackable in seconds.

How to Fix: At minimum, move to .env, or better, load from hashed DB table.

Must fix for any external demo!
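
If credentials move to a hashed DB table, the check could look roughly like this. It is a sketch that assumes PBKDF2-HMAC-SHA256 from the standard library as the hash. The iteration count is a placeholder to tune, and the function names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # placeholder value; tune for the deployment hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both instead of the plain password."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected_digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

At login, the stored salt and digest for the username would be looked up and passed to `verify_password`, so the plain password never needs to live in code or config.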

B. Secrets in Code
app.secret_key = "dev-secret"
JWT_SECRET_KEY = os.getenv("JWT_SECRET_KEY", "dev-secret")

Why: "dev-secret" in code means anyone with access can forge tokens.

How to Fix: Set all secrets only via environment variables.
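
One way to drop the "dev-secret" fallback is to refuse missing or placeholder values when config loads. This is a sketch; the helper name `load_secret` and the placeholder set are assumptions.

```python
import os

# Values that must never be accepted as real secrets (assumed list).
DEV_PLACEHOLDERS = {"dev-secret"}


def load_secret(name: str, environ=os.environ) -> str:
    """Read a secret from the environment; reject missing or placeholder values."""
    value = environ.get(name, "")
    if not value or value in DEV_PLACEHOLDERS:
        raise RuntimeError(f"{name} must be set to a non-default value")
    return value
```

For example, `JWT_SECRET_KEY = load_secret("JWT_SECRET_KEY")` would replace the `os.getenv(..., "dev-secret")` call. Because a startup guard that terminated the app is what broke One Login in this QA run, a check like this should be tried on the cPanel deployment with the real environment before it is relied on.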

C. No HTTPS Enforcement (in code)
No @app.before_request to enforce HTTPS.

Why: Without HTTPS, tokens and auth codes leak in transit (easy MitM).

Fix: Always run behind HTTPS proxy.

MVP OK if running only on localhost/behind trusted firewall.

D. Weak Redirect URI Check
Only checks startswith(); can be tricked with URLs like http://localhost:5000.evil.com/callback

How to Fix: Parse and strictly match hostname/scheme.

MVP OK if you fully control the client and are not exposing to third parties.
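
A stricter check could parse the URI and compare its components against an allowlist, as sketched here. The allowlist contents and the function names are assumptions; the callback URL is the one mentioned in the client review.

```python
from urllib.parse import urlsplit

# Assumed allowlist of registered redirect URIs.
ALLOWED_REDIRECT_URIS = ("http://localhost:5000/callback",)


def _components(uri: str):
    parts = urlsplit(uri)
    # Accessing .port raises ValueError when the port is not numeric.
    return (parts.scheme, parts.hostname, parts.port, parts.path)


def is_allowed_redirect(uri: str, allowed=ALLOWED_REDIRECT_URIS) -> bool:
    """Accept only URIs whose scheme, host, port, and path match a registered URI."""
    try:
        candidate = _components(uri)
    except ValueError:
        return False
    return any(candidate == _components(entry) for entry in allowed)
```

The URL `http://localhost:5000.evil.com/callback` passes a `startswith("http://localhost:5000")` test but is rejected here. Query strings are ignored by this comparison; comparing the full string exactly is stricter still if the client never appends parameters.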

E. No CSRF/XSRF Protection
/login form posts with no CSRF token.

Risk: Session fixation, CSRF in browser flows.

Fix: Flask-WTF CSRF or similar.

F. Session Storage
Using Flask session for client_id, etc., with a dev secret (not ideal, but not a major risk for MVP, as it’s not auth session).

G. Audit Logging
Logging is great, but watch for accidental logging of secrets, passwords, or tokens.

H. Missing: Rate limiting, lockout on login failure, brute-force protection
No account lockout or rate limiting on /login.

Risk: Password brute-forcing.

Fix: Flask-Limiter or similar, especially for public deployments.
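
To show the idea behind rate limiting on /login, here is a small in-memory sliding-window limiter. It is a standard-library illustration, not Flask-Limiter. The class name and limits are hypothetical, and the state is lost on restart and not shared between processes.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Allow at most max_attempts per key (e.g. client IP) within window_seconds."""

    def __init__(self, max_attempts=5, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        self._attempts = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        attempts = self._attempts[key]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

In a login handler, a `False` result from `allow(client_ip)` would translate to an HTTP 429 response before any password check runs.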

I. JWT Key Rotation
No support for rotating JWT signing keys or for exposing public keys for verification.
MVP: Fine if all services are under your control.

J. Error Handling/Info Leaks
Some errors could reveal sensitive info (e.g., “Invalid client_secret for X”).
Avoid echoing which fields were wrong in production.

4. What Looks Professional / You Won’t Look Like an Idiot
This MVP is better than most “first time” attempts:
You’re actually following the real OAuth2 flow.
Codes are single-use and stored in a DB.
JWT tokens have proper claims, exp, etc.
Logging is robust.
Configs are centralized.
You’ll look pro IF you:
Immediately fix hardcoded credentials.
Never use "dev-secret" in prod.
Make clear in your docs (README is decent!) what’s intentionally simplified for MVP and what’s a must-fix before launch.
Bonus: Add a comment/TODO in code at each “dangerous” place. (e.g., # TODO: Move credentials to secure DB before prod)

5. Summary Table



| Area | MVP Grade | Production Grade | Must-Fix for External Demo? | Note |
|---|---|---|---|---|
| Hardcoded creds | OK | FAIL | Yes | Move to .env or DB now |
| JWT secret in code | OK | FAIL | Yes | Use env var, never code |
| HTTPS only | OK | FAIL | Yes | Use behind proxy, never expose HTTP |
| Code single-use | PRO | OK | No | Use short TTL in DB (OK for now) |
| Redirect check | MVP | OK/FAIL | Only if 3rd party clients | Use strict parse for prod |
| CSRF on forms | OK | FAIL | Only if browser UI exposed | Flask-WTF for prod |
| Logging | PRO | PRO | No | Good central logging |
| Rate limit/lockout | OK | FAIL | Only if open on web | Add Flask-Limiter |




6. Bottom Line for MVP Demo
DO NOT share the public URL with anyone outside your team unless you’ve set real secrets and hidden the creds.
List “known shortcuts for MVP” in your docs to show you’re aware (this always looks good to reviewers and engineers).

Client Application review
Great, let's review your client app’s security posture and architecture.

You’ve built a Flask-based API Gateway/UI (plus Worker, Parser, Logging microservices), now enforcing login using your identity-backend. I’ll focus on:
- Where you nailed the MVP (good practices, clear wins)
- What is a liability, or will make you look unprepared to a reviewer
- The flow from login to API use: holes, mitigations, and “easy wins”
- Actionable advice (“MVP fine” vs “fix ASAP”)


1. High-Level Review: OIDC/OAuth2 Flow in Client
How it works:
- /login → redirects user to identity-backend /authorize
- /callback → gets an auth code from the identity service and exchanges it for an id_token via /token (with client_secret and client_id)
- Stores id_token (JWT) in session
- All protected routes call require_login(), which checks/validates the token before rendering anything private (e.g., /, /upload, /query-ui)
- Logout clears session

2. What’s Strong / Looks Professional (MVP)
- Uses a real OIDC-like flow: code exchange, client_id, client_secret, redirect_uri all present
- State parameter is generated and validated to prevent CSRF on OIDC flow
- Audience, issuer, expiry checks on JWT (even if signature verify is skipped for now)
- Logs every step, including token hashes (not raw tokens!), source IP, and UA, showing traceability and audit effort
- Session secret is configurable and not hardcoded (FLASK_SECRET_KEY)
- Environment-based config for secrets, API keys, etc.
- Centralized logging utility (nice touch)
- Logs endpoint is clearly marked as public and temporary, with a warning in code
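
Logging a token hash instead of the raw token can be as small as the sketch below. The helper name and the truncation length are assumptions, not the app's actual logging code.

```python
import hashlib


def token_fingerprint(token: str, length: int = 12) -> str:
    """Return a short SHA-256 prefix that identifies a token in logs without exposing it."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:length]
```

The same token always yields the same fingerprint, so log lines can be correlated, while the token itself cannot be recovered from the log.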


3. What’s MVP Only / Needs Attention Before Launch

A. Token Signature Verification
Current: Skips signature verification (options={"verify_signature": False} in jwt.decode)
Why risky: Anyone can forge a token with correct claims
How to fix: Always verify signature with the real secret (available in all your services)
MVP ok for dev/test, but will look “rookie” if left before review
B. Session Security
Flask session cookie is signed. HttpOnly is already on by default in Flask, but Secure is not.
Set SESSION_COOKIE_SECURE = True (and keep SESSION_COOKIE_HTTPONLY = True) if running behind HTTPS.
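
The cookie settings could be grouped in one place, as in this sketch. The dictionary name is hypothetical, the keys are Flask's documented config names, and SameSite is taken from the nice-to-have list later in this review.

```python
# Flask session cookie settings, meant to be applied with app.config.update(...).
SESSION_COOKIE_SETTINGS = {
    "SESSION_COOKIE_SECURE": True,     # only send the cookie over HTTPS
    "SESSION_COOKIE_HTTPONLY": True,   # keep the cookie out of reach of JavaScript
    "SESSION_COOKIE_SAMESITE": "Lax",  # "Strict" is the tighter alternative
}
```

With Secure enabled the session cookie is not sent over plain HTTP, so a local HTTP-only dev run would need this left off.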
C. Client Secret Handling
Secret is in .env (good for MVP), but do not check this into git
No brute-force or rate-limiting on code/token exchange; risk is lower in MVP, but consider Flask-Limiter in prod
D. JWT in Session Only
This means APIs you expose to browser clients are as secure as your session—fine, but if you add APIs intended for pure API access, you’ll want to accept JWT in the Authorization header and verify it.
E. Logs Endpoint
/logs and /logs.json are public and marked as such—do not forget to restrict this in prod.
Add a check like if not is_logged_in(): return redirect(...)
Leaving logs public is a classic "rookie mistake" in prod.
F. CSRF
State param is handled on login/callback (great!)
No CSRF tokens for /upload, /query-ui forms—fix if allowing non-API client use
G. Token Scope and Claims
No fine-grained authorization (just authentication)
No RBAC, but that’s fine for MVP
H. OIDC Redirect URI Security
Only one allowed redirect URI (http://localhost:5000/callback), and is checked, but just as string.
If user-supplied, parse with urllib and match hostname/protocol, not just startswith
I. Error Handling/Info Leaks
Error pages can leak too much info in debug—wrap in custom error handler for prod
J. OpenAI Key Management
Reads from env, but never expose this to browser clients (yours is safe as backend only)


4. “Showstopper” Issues to Fix Before Going Public
- Token signature must be verified in verify_id_token() (high severity if not fixed)
- Remove/secure the logs endpoint
- Set session cookie secure/httponly
- Remove any hardcoded/test secrets before git push/public deploy


5. Nice-to-Have Improvements (for Reviewer/Professional Polish)
- Add SESSION_COOKIE_SAMESITE = 'Lax' or 'Strict' to prevent cross-site session attacks
- Use .env for all secrets, never fallback to “dev-” defaults in prod
- Add rate-limiting on login/token/callback endpoints
- Implement proper error page templates for 400/401/500s
- Add X-Content-Type-Options: nosniff, X-Frame-Options: DENY headers for bonus security


6. Your Actual Flow: Any Holes?
- All protected endpoints go through require_login() → good
- Session-stored JWT means browser users can't access APIs directly (by design), which is safe for web UIs
- If running only on localhost or dev, current setup is fine.
- If deploying externally, you must enforce HTTPS and verify JWT signatures everywhere


7. Checklist Table




| Area | MVP Grade | Prod Grade | Must-Fix for Review/Demo | Notes |
|---|---|---|---|---|
| OIDC code flow | PRO | PRO | No | Well-implemented, clear state mgmt |
| JWT signature verify | OK | FAIL | Yes | Remove `verify_signature: False` |
| Logs endpoint public | OK | FAIL | Yes | Restrict before prod |
| Session cookie security | OK | OK | Yes | Add Secure/HttpOnly/SameSite |
| .env secret mgmt | PRO | PRO | No | Good, just don’t check into git |
| CSRF tokens on POST | OK | FAIL | If browser upload used | Use Flask-WTF if necessary |
| Rate limiting | OK | FAIL | No | Flask-Limiter for brute force in prod |




8. Example: How to Fix JWT Signature Verification

Change this:

claims = jwt.decode(token, options={"verify_signature": False})

To:

claims = jwt.decode(token, os.getenv("JWT_SECRET_KEY"), algorithms=["HS256"])



Make sure the secret is set in your env.
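
To make clear what the signature check protects against, here is a standard-library sketch of HS256 verification with issuer, audience, and expiry checks. It is an illustration only and is not meant to replace the `jwt.decode` call above. The function names are hypothetical, and `sign_hs256` exists only so the example is self-contained.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _mac(secret: str, signing_input: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: str) -> str:
    """Build a compact HS256 token (helper for the example only)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url_encode(json.dumps(claims).encode("utf-8"))
    signature = _b64url_encode(_mac(secret, f"{header}.{payload}"))
    return f"{header}.{payload}.{signature}"


def verify_hs256(token: str, secret: str, issuer: str, audience: str) -> dict:
    """Return the claims if signature, iss, aud, and exp all check out; else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
        expected = _mac(secret, f"{header_b64}.{payload_b64}")
    except (ValueError, TypeError) as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    # Without this comparison, anyone could mint a token with arbitrary claims.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    if claims.get("iss") != issuer or claims.get("aud") != audience:
        raise ValueError("wrong issuer or audience")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= time.time():
        raise ValueError("token expired or missing exp")
    return claims
```

In the app itself, passing the key and `algorithms=["HS256"]` to `jwt.decode` as shown above performs the signature check. PyJWT's `jwt.decode` also accepts `audience` and `issuer` arguments if those checks should live in the same call.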



9. Final Verdict (for MVP and Professional Impression)

This is a solid MVP: you’ve gone much further than most (audience, issuer, and expiry checks, logging, state validation).
You WON’T look like an idiot as long as you:
- Fix signature verification
- Secure your logs
- Tighten up session cookie security
- Remove “dev” secrets


QA - Run #1 - Results

