College of Professional and Global Education · School of Information

Big Data Analytics and Management
INFM 203

  • Fall Session II 2023
  • Section 10
  • 2 Unit(s)
  • 10/09/2023 to 12/06/2023
  • Modified 07/17/2023

Canvas Information

This course will be available on Canvas October 9th, 6 am PT. (Beginning of Informatics Session II)

You will be enrolled in the Canvas site automatically.

Contact Information

Dr. Prateek Jain
Office: Virtual
Phone: (408) 924-2490
Office Hours: Virtual office hours. Telephone and in-person advising by appointment


Course Information

1. If you would like to meet with me via Zoom, please give me advance notice (48 hours minimum). Send an email to my SJSU address and we will coordinate an appointment time slot.

2. If you fall more than 2 days behind on assignments or coursework because of personal, family, or work issues, please get approval from the graduate/program coordinator and have them contact me with that approval. No exceptions.

3. You might be asked to review other students' material as a peer reviewer. This is to expose you to others' thinking, NOT to offload my work.

Course Description and Requisites

Overview and use of important big data technologies, trends, infrastructure, and management issues that enable users to make informed and strategic decisions in the presence of large-scale data sets.


Graduate Standing or Instructor Consent.

Classroom Protocols


Students are expected to participate fully in all class activities. It is expected that students will be open-minded and participate fully in discussions in class and debate in a mature and respectful manner. Use of derogatory, condescending, or offensive language including profanity is prohibited. Disagreement is healthy and perfectly acceptable. Expressing disagreement should always include an explanation of your reasoning and, whenever possible, evidence to support your position. In accordance with San José State University's Policies, the Student Code of Conduct, and applicable state and federal laws, discrimination based on gender, gender identity, gender expression, race, nationality, ethnicity, religion, sexual orientation, or disability is prohibited in any form.

Program Information

Course Workload

Success in this course is based on the expectation that students will spend, for each unit of credit, a minimum of forty-five hours over the length of the course (normally 3 hours per unit per week with 1 of the hours used for lecture) for instruction or preparation/studying or course related activities including but not limited to internships, labs, clinical practica. Other course structures will have equivalent workload expectations as described in the syllabus.

Instructional time may include but is not limited to:
Working on posted modules or lessons prepared by the instructor; discussion forum interactions with the instructor and/or other students; making presentations and getting feedback from the instructor; attending office hours or other synchronous sessions with the instructor.

Student time outside of class:
In any seven-day period, a student is expected to be academically engaged through submitting an academic assignment; taking an exam or an interactive tutorial, or computer-assisted instruction; building websites, blogs, databases, social media presentations; attending a study group; contributing to an academic online discussion; writing papers; reading articles; conducting research; engaging in small group work.

Course Goals

SLOs and PLOs

This course supports Informatics SLO 3: Demonstrate proficiency in using current big data and electronic records technologies to solve analytical problems; including developing policies, standards, and practices in particular specialized contexts and interpreting and communicating analysis and visualization results appropriately and accurately.

SLO 3 supports the following Informatics Program Learning Outcomes (PLOs):

  • PLO 2 Evaluate, manage, and develop electronic records programs and applications in a specific organizational setting.
  • PLO 3 Demonstrate strong understanding of security and ethics issues related to informatics, user interface, and inter-professional application of informatics in specific fields by designing and implementing appropriate information assurance and ethics and privacy solutions.
  • PLO 6 Conduct informatics analysis and visualization applied to different real-world fields, such as health science and sports.

Course Learning Outcomes (CLOs)

Upon successful completion of the course, students will be able to:

  1. Describe and explain how the main technologies and trends in big data work, specifically data visualization, large-scale database management, map-reduce paradigm, and big data mining.
  2. Demonstrate proficiency in using current big data technologies to solve big data analytical problems.
  3. Interpret and communicate big data analysis and visualization results appropriately, effectively and accurately.
  4. Discuss, articulate and compare various big data management issues (e.g., big data privacy).

Course Materials


No Textbooks For This Course.

Course Requirements and Assignments

All assignments are due by Sunday midnight PT at the end of the week in which they are scheduled, as noted in the table below. Completion of practical lab work is evidenced by submitting an MS Word file (YourLastName,YourFirstName-WeekNN-Report.docx) to the discussion section and to the appropriate week's folder for this course on Canvas. Assignments are subject to change with fair notice.

Weekly Reports (20% of the overall grade, supports CLO 2)

Individual, hands-on practices will be given throughout the semester to help students review and reinforce what they have learned in class. These reports are an important part of the course. They generally require the submission of a report (MS Word File, .docx) to the Canvas website for the course.


  • The expectation with these reports is to give you hands-on practice by working on case studies relevant to the topic.
  • Reports should be at least two pages, single-spaced.
  • The reports must be submitted by the deadline. Canvas automatically deducts points for late submissions, depending on the length of the delay.
  • Don't take the reports lightly. The expectation is to see work at the level of a graduate student. If you are not sure what a graduate student report requires, please see this reference. You will use the relevant sections and material.
  • Plagiarism is NOT OK under any circumstances. Any cases will be referred to the department.
  • The material presented in your reports should be related to the topic. Any material which is inserted to increase the length of the report will result in zero points.

Weekly MCQ quiz (25% of the overall grade, supports CLO 2)

Weekly Multiple Choice Quizzes (MCQ) cover the material from that week. Every effort will be made to ensure the questions are directly related to the material, but some questions may require you to infer the answer (that is, to think). The objective is to make you think about questions you might encounter when applying Big Data tools and technologies. The weekly quiz will be administered via Canvas. You will get three attempts, and the highest score will be kept.

Semester Mini-Project (30% of the overall grade, supports CLOs 1-4)

Students will work in teams or alone on a Mini-Project that consists of three phases (more details TBA on the Canvas site). The main requirement of the project is that it uses at least one topic covered in the class.

  • Milestone I - Initial Thoughts: Students will submit a short paragraph discussing the potential topics and directions of the semester project. Students will also briefly present the motivation for the study and the approach that might be taken.

Emphasis in the Project is on understanding the chosen data set, data gathering, data munging/preparation, and the steps leading towards data analysis (this work is generally 60-80% of any data project). Heavy analysis and computer programming are beyond the scope of the project.

  • Milestone II - Mini Report: Students will submit a one-page report outlining the current progress of the project. The report will include what has been done, what the current status and results are, and what needs to be accomplished.
  • Final Report & Demo: Students will submit a detailed, 10-page report for the project. The report should at least include the following sections: motivation, problem statement, methodology, analysis results, discussion, and any conclusions. Students will also prepare a short “demo” to present and discuss their work.

You will be provided with a template/sample report to use for guidance. These reports are an essential part of the course. Please read the guidelines for Weekly Reports, as they apply to the project report as well.

Final Exam Online (Multiple-Choice Questions [MCQ] + Short-Answer Questions)
(25% of the overall grade, supports CLOs 1-4)

Grading Information

Grading will be based on a possible total of 100 percent, distributed as follows:


Percent of overall grading (Total = 100%)

Weekly Reports: 20%

Semester Mini-Project: 30%

Milestone I: 2%
Milestone II: 3%
Final Report & Project Demo: 25%

Grading is based on the submissions plus constructive comments on at least two other students' projects.

Weekly Multiple Choice Quiz: 25%

Final Exam: 25%

These deliverables will be graded using larger point values, but the totals for each type of deliverable will be scaled to the relative percentages shown in this table.
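
The scaling described above can be illustrated with a short calculation. This is only a sketch, not the official Canvas computation: the category weights are the ones stated in this syllabus, while the function name and all raw point totals are hypothetical values chosen for illustration.

```python
# Category weights as stated in this syllabus (percent of overall grade).
WEIGHTS = {
    "weekly_reports": 20,
    "weekly_quizzes": 25,
    "mini_project": 30,
    "final_exam": 25,
}


def overall_percent(earned, possible):
    """Scale each category's raw points to its weight and sum the results."""
    total = 0.0
    for category, weight in WEIGHTS.items():
        total += weight * earned[category] / possible[category]
    return total


# Hypothetical raw point totals, for illustration only.
possible = {"weekly_reports": 700, "weekly_quizzes": 140,
            "mini_project": 200, "final_exam": 100}
earned = {"weekly_reports": 630, "weekly_quizzes": 126,
          "mini_project": 170, "final_exam": 80}

print(round(overall_percent(earned, possible), 1))  # prints 86.0
```

In this made-up example, a student who earns 90% of the report and quiz points, 85% of the project points, and 80% of the final exam points ends up with 18 + 22.5 + 25.5 + 20 = 86 percent overall, regardless of how many raw points each category carries.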

Grading Information

This is a Credit/No Credit course. Incompletes will only be awarded in the case of serious medical or family issues (with appropriate documentation supplied).

University Policies

Per University Policy S16-9 (PDF), relevant university policy concerning all courses, such as student responsibilities, academic integrity, accommodations, dropping and adding, consent for recording of class, etc. and available student services (e.g. learning assistance, counseling, and other resources) are listed on the Syllabus Information web page. Make sure to visit this page to review and be aware of these university policies and resources.

Course Schedule

This schedule and the related dates, readings, and assignments are tentative and subject to change with fair notice. Any changes will be announced in due time in class and in the Canvas Learning Management System (LMS). Students are obliged to consult the most recent and detailed version of the reading material and syllabus, which will be posted on the course’s website.

Week# — Starting Date




Oct 9th


  1. What is Big Data? What is Data Analytics?
  2. Roles: Business Analyst, Data Engineer, Data Scientist

Project Initialization


Oct 16th

1, 2

  1. Hadoop & HDFS
  2. MapReduce and Distributed Computing

Week 1 Report Due

Week 1 MCQ Quiz Due





Oct 23rd

  1. Spark
  2. The Hadoop Ecosystem

Week 2 Report Due

Week 2 MCQ Quiz Due


Oct 30th

  1. Data Lakes / Data Fabric / Cloud
  2. Relational Databases & the NoSQL movement

Week 3 Report Due

Week 3 MCQ Quiz Due


Nov 6th


  1. Data Movement

Multiple Choice Quiz (MCQ)

Week 4 Report Due

Week 4 MCQ Quiz Due


Nov 13th


  1. Tableau for Data Exploration
  2. Visualization using Tableau
  3. EDA (Exploratory Data Analysis)

Week 5 Report Due

Week 5 MCQ Quiz Due


Nov 20th


  1. Management, Governance, and Data Security

Week 6 Report Due

Week 6 MCQ Quiz Due


Nov 27th


Course Review
Project Presentations / Submissions

Week 7 Report Due

Week 7 MCQ Quiz Due


Dec 4th


Last Day of Instruction

Final Exam (MCQ + Short Answer) (on Canvas)
Final Practical Exam (sent to Instructor by email)