Sunday 6 December 2015

Assignment on Data Analytics Report

 Descriptive Analytics and Visualisation for Advanced data analysis report


This assignment requires you to analyse a given data set, interpret and draw conclusions from your analysis, and then convey your conclusions in a written report to a person with little or no knowledge of Business Analytics. Analysis of the data requires the use of techniques predominantly studied in Module 2 (but will also require some techniques from Module 1).

Case Study

Baycoast is a (fictitious) local government area (called a 'city') within greater Melbourne, Australia. It consists of a number of different suburbs, all with their own history of development. The city grew in different stages, with new suburbs gradually emerging. It covers some wealthy suburbs and some not so wealthy. As the name would indicate, the city is located on the Bay.
The city stretches for several kilometres along the Bay's lovely beaches, and for several kilometres inland. About 60,000 people live in the suburbs of Baycoast.
The main objective is to conduct exploratory, descriptive and causal analysis to gain a comprehensive understanding of house prices in the Baycoast region and of the most important factors that affect prices. Your analysis will be based on a random sample of 120 houses from the city. Note that for the purpose of the assignment the unit of analysis is a 'House', defined as a stand-alone dwelling. That is, flats, apartments, etc. are not included in the database.
The assignment requires five separate tasks:
1. An overall view of house prices in Baycoast.
2. Identification of the main factors influencing house prices.
3. Development of a multiple regression model for prices.
4. Some basic time series analysis of house prices.
5. Discussion of the suitability of the data set, along with other potential data sources and approaches for the purpose of this analysis.
Further details of each task are given below.

The Data

The cross-sectional data collected contains a number of categorical and numerical variables which are described below:

Price - Selling price of house in $'000
Rooms - Number of main rooms in the house
Lot Size - Area of the block of land (lot) in square metres
Age - Age of the house in years
Area - Area of the house in square metres
Material - Timber = 1, Veneer = 2, Brick = 3
To Train - Distance of the house to the nearest train station (kilometres)
To Bus - Distance of the house to the nearest bus stop (kilometres)
To Shops - Distance of the house to the nearest shopping centre (kilometres)
Street - Street appeal as evaluated by the real estate agency: ranges from 0 (lowest appeal) to 10 (highest appeal)
Storeys - Number of storeys or levels in the house
Style - Traditional Style = 0, Non-Traditional Style = 1
Bedrooms - Number of bedrooms
Bathrooms - Number of bathrooms
Kitchen - Style of kitchen: Adequate = 0, Modern = 1
Heating - Central or other heating system installed: No Heat = 0, Yes Heat = 1
AirCon - Air conditioning installed: No AC (No AirCon) = 0, AC (Yes AirCon) = 1
Bay Views - Proportion of views of the Bay from a prominent part of the property: ranges from 0 = Nil views up to 1 = Full views
Suburb - Three different suburbs: 1 = Brightly, 2 = Tarron B, 3 = Millard
Weekly Rent $ - Actual or estimated weekly rent in $
Rental Return % - Annual rate of return from rent income (Weekly rent x 52)/(Price in $'000) as a percentage (%)
Condition - The condition of the house in general. Very Poor = 1, Poor = 2, Good = 3, Excellent = 4
Rental Status - Vacant (available for rent) = 1; Rented (currently rented) = 2; Owner (occupied by owner) = 3
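
As a quick check of how the Rental Return % variable is built from the other two, the following is a minimal sketch in Python. It assumes the price (recorded in $'000) is converted to dollars before dividing, and the function name and the example house are hypothetical, not taken from the Baycoast data. It is worth confirming the result against the Rental Return % column in the spreadsheet.

```python
def rental_return_pct(weekly_rent, price_thousands):
    """Annual rent income as a percentage of the selling price.

    Assumption: price_thousands is the Price variable in $'000,
    so it is multiplied by 1000 to express it in dollars.
    """
    annual_rent = weekly_rent * 52
    price_dollars = price_thousands * 1000
    return annual_rent / price_dollars * 100


# Hypothetical house: $500 weekly rent, selling price $650,000
print(round(rental_return_pct(500, 650), 2))  # prints 4.0
```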

In addition, time series data is available on quarterly median house prices:

Time Period - Time Period Index
Quarter - Quarter Description
Median House Price ($'000) - Median house price in $'000

Task One Summary of House Prices

Only analyse Price by itself. The importance of other variables is considered in other tasks. You should, at the very least, thoroughly investigate relevant summary measures (and their reliability) for this variable. Also, there may well be suitable tables and graphs that will illustrate, further and more clearly, other important features of house prices. In your report you should comment, where relevant, on data location, central tendency, variability, shape and outliers for this variable.
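
The summary measures mentioned above can be sketched outside Excel as follows. This is only an illustration using Python's standard library: the price list is hypothetical (it is not the Baycoast sample), and the 1.5 x IQR rule is one common convention for flagging outliers, assumed here rather than prescribed by the assignment.

```python
import statistics

# Hypothetical selling prices in $'000 (NOT the Baycoast sample)
prices = [420, 455, 480, 510, 530, 560, 600, 640, 720, 1500]

# Central tendency and variability
mean = statistics.mean(prices)
median = statistics.median(prices)
stdev = statistics.stdev(prices)  # sample standard deviation

# Quartiles and the 1.5 x IQR fences for flagging possible outliers
q1, q2, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [p for p in prices if p < lower_fence or p > upper_fence]

# A mean well above the median suggests a right-skewed distribution
print("mean:", mean, "median:", median, "stdev:", round(stdev, 1))
print("IQR:", iqr, "outliers:", outliers)
```

Note that quartile conventions differ between tools, so the quartiles from this sketch may not match Excel's exactly.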


 Task Two Factors influencing house prices

Analyse house prices against the other variables included in the data set. Use appropriate descriptive techniques such as cross-tabulations, comparative summary measures and scatter diagrams to identify key relationships. In your report you should only include the most important factors that affect house prices (approximately 3 to 5 factors).
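
Two of the techniques named above, a correlation for a numerical factor and comparative summary measures for a categorical factor, can be sketched as below. The records are hypothetical and the helper name pearson is an illustrative choice; the same figures could be produced in Excel with a correlation function and a pivot table.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical records: (price in $'000, house area in m2, suburb code)
houses = [
    (520, 140, 1), (610, 170, 1), (700, 200, 1),
    (450, 130, 2), (480, 150, 2), (560, 180, 2),
    (800, 210, 3), (900, 240, 3), (950, 260, 3),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


prices = [h[0] for h in houses]
areas = [h[1] for h in houses]
r = pearson(areas, prices)

# Comparative summary measure: mean price within each suburb
by_suburb = defaultdict(list)
for price, _, suburb in houses:
    by_suburb[suburb].append(price)
mean_price = {s: sum(p) / len(p) for s, p in sorted(by_suburb.items())}

print("correlation of Area with Price:", round(r, 3))
print("mean price by suburb:", mean_price)
```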

 
Task Three Development of a multiple regression model

You should follow the model building process outlined in topic 5. You are only required to consider linear relationships in the model. Each stage of developing your model should be included in your analysis. You will notice in the Baycoast spreadsheet that there are tabs called Q3-1, Q3-2, etc. These are where you place each version of your model. Note that if you have undertaken more iterations of the model then add more worksheets.
The report should only include your final model and a description of its overall strength as well as the influence of each variable.
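
To illustrate what a linear multiple regression estimates, here is a minimal sketch that fits coefficients by least squares using only the Python standard library. It is not the Excel regression tool the assignment expects, and the function names, the two predictors chosen and the data are all hypothetical. The data were constructed to follow Price = 100 + 2.5 x Area + 200 x Bay Views exactly, so the recovered coefficients can be checked by eye and R-squared comes out as 1; real data will not fit this cleanly.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(rows, y, coefs):
    """Share of the variation in y explained by the fitted model."""
    fitted = [coefs[0] + sum(b * v for b, v in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical predictors: (house area in m2, proportion of Bay views)
rows = [(120, 0.0), (150, 0.2), (180, 0.0), (200, 0.5),
        (220, 1.0), (250, 0.3), (160, 0.8)]
# Hypothetical prices in $'000, built as 100 + 2.5*area + 200*views
y = [400, 515, 550, 700, 850, 785, 660]

coefs = fit_ols(rows, y)
print("intercept, Area, Bay Views:", [round(c, 3) for c in coefs])
print("R-squared:", round(r_squared(rows, y, coefs), 4))
```

Each slope is read as the change in price (in $'000) for a one-unit increase in that variable while the other variables are held fixed.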

Task Four Time Series analysis

Quarterly median house prices in Baycoast from Q4, 2009 to Q3, 2013 are given in the QtrPriceData worksheet. Develop a multiplicative time series model to forecast median house prices for the next 4 quarters (Q4, 2013 to Q3, 2014).
If the observed values for those 4 quarters are as below, calculate the MAPE of the forecast.

Time Period   Quarter   Observed
17            2013-Q4   980
18            2014-Q1   1062
19            2014-Q2   1206
20            2014-Q3   954
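
The forecasting and MAPE steps can be sketched as follows. In a multiplicative model each forecast is the trend value for the period multiplied by the seasonal index for that quarter, and MAPE is the average of the absolute percentage errors. The observed values are those listed above; the trend line and seasonal indices are hypothetical placeholders, NOT estimates from the QtrPriceData worksheet, and should be replaced by the values from your own model.

```python
def multiplicative_forecast(intercept, slope, seasonal_index, period):
    """Trend value at `period`, scaled by that quarter's seasonal index."""
    return (intercept + slope * period) * seasonal_index


def mape(observed, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs(o - f) / o for o, f in zip(observed, forecast)]
    return sum(errors) / len(errors) * 100


# Observed median prices ($'000) for periods 17 to 20, from the table above
observed = [980, 1062, 1206, 954]

# HYPOTHETICAL trend line and seasonal indices (placeholders only)
intercept, slope = 700.0, 20.0
seasonal = {"Q4": 0.95, "Q1": 1.00, "Q2": 1.10, "Q3": 0.95}
quarters = [(17, "Q4"), (18, "Q1"), (19, "Q2"), (20, "Q3")]

forecasts = [
    multiplicative_forecast(intercept, slope, seasonal[q], t)
    for t, q in quarters
]

print("forecasts:", [round(f, 1) for f in forecasts])
print("MAPE (%):", round(mape(observed, forecasts), 2))
```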


Task Five Critique the Business Research Approach

Discuss the suitability of the general business research approach taken. In your response, include possible alternative approaches and other sources of (secondary) data. If the analysis were to be repeated in the future, would you recommend a different approach? Note that no actual analysis is required for this task.

 Submission
You are required to submit both your written report (approx. 2000 words) and analysis (in Excel).
 Report (40%)
The report should be written for an audience that has no or minimal business analytics background. You should avoid the use of technical terms and mathematics. The one exception may be in task 3 as you may want to include the actual regression model in the report. You are required to describe all five tasks. It is up to you how to structure and format the final report.

 Analysis (60%)

The analysis should be submitted in the appropriate worksheets in the Excel file. That is, all analysis for task one should be included in tab 'Q1', task two in 'Q2', etc. Each step in the model building for task three should be included in the tabs Q3-Correlation, Q3-1, Q3-2, etc. If you need more worksheets then add them. Further instructions are included at the top of each worksheet.
Before submitting your analysis make sure it is logically organised and any incorrect or unnecessary output has been removed. Marks will be deducted for poor presentation or disorganised/incorrect results.
The approximate breakdown of marks for the analysis is task 1 (10%), task 2 (10%), task 3 (30%) and task 4 (10%), giving a total for analysis of 60%.
