Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi All,
I am very new to R script, i have a r script where i need to edit as per requirement, below is forecast chart where it shows the actuals and forecast values by days, i want to show the summarize value by month. And second image is the requirement.
Here is my R code-using two columns from dataset for dates and values
### @date 2017-07-28
### R Script used for displaying a graph containing 3-month forecast
### aggregated by day.
Sys.setenv(TZ='UTC')
# Libraries
library('ggplot2')
library('forecast')
library('tseries')
library('lubridate')
# Determine which column is the date column
firstCol = dataset[1,1]
secondCol = dataset[1,2]
isFirstDate <- grepl("[0-9]+-[0-9]+-[0-9]+", toString(firstCol))
isSecondDate <- grepl("[0-9]+-[0-9]+-[0-9]+", toString(secondCol))
dateColumn <- 1
valuesColumn <- 2
if(!isFirstDate){
dateColumn <- 2
valuesColumn <- 1
}
theDates <- data.frame(dataset[dateColumn])
names(theDates) <- c("date")
theValues <- data.frame(dataset[valuesColumn])
names(theValues) <- c("values")
dataset <- cbind(theDates, theValues)
# Get information on dataset, create date sequence
count <- nrow(dataset)
firstdate = dataset$date[1]
firstyear = as.double(substring(dataset$date[1], 1, 4))
lastdate = dataset$date[count]
lastyear = as.double(substring(dataset$date[count], 1, 4))
dataset$date <- sprintf("%s%s", dataset$date, " 00:00:00 UTC")
dataset <- dplyr::arrange(dataset, date)
# Fill in the empty dates with zeros
# Get base values
i <- 2
x <- data.frame(dataset$values[1])
d <- data.frame(dataset$date[1])
names(x) <- c("values")
names(d) <- c("date")
# Loop through each value
while(i <= count){
# Get current day, month values
thisMonth <- substring(dataset$date[i],6,7)
lastMonth <- substring(dataset$date[i-1],6,7)
today <- substring(dataset$date[i],9,10)
yesterday <- substring(dataset$date[i-1],9,10)
# Add missing dates if applicable
# Add as many days as differenced from last date entry
if( ((as.numeric(yesterday) + 1) != as.numeric(today)) & (thisMonth == lastMonth)){
# How many days do we need to fill in?
daysToFill <- as.numeric(today)-as.numeric(yesterday)-1
# Fill in the days
while(daysToFill > 0){
# Add to new dataset
zeroValue <- data.frame(0)
names(zeroValue) <- c("values")
x <- rbind(x, zeroValue)
dateValue <- as.Date(sprintf("%s-%s-%s",substring(dataset$date[i],1,4), thisMonth, toString(as.numeric(substring(dataset$date[i],9,10)) - daysToFill)))
dateValue <- data.frame(sprintf("%s%s", as.Date(dateValue), " 00:00:00 UTC"))
names(dateValue) <- c("date")
d <- rbind(d, dateValue)
# decrement the daysToFill counter
daysToFill <- daysToFill - 1
}
# Case when the first days of month are skipped
} else if(thisMonth != lastMonth){
# Get amount of days to fill
daysToFill <- as.numeric(today)-1
# Add each day missing
while (daysToFill > 0){
# Append dataset
zeroValue <- data.frame(0)
names(zeroValue) <- c("values")
x <- rbind(x, zeroValue)
dateValue <- sprintf("%s-%s-%s",substring(dataset$date[i],1,4), thisMonth, as.character(daysToFill))
dateValue <- data.frame(sprintf("%s%s", as.Date(dateValue), " 00:00:00 UTC"))
names(dateValue) <- c("date")
d <- rbind(d, dateValue)
daysToFill <- daysToFill - 1
}
}
# If no missing values exist, then continue and then add to original dataset
i <- i + 1
}
# Append the dataset, then sort by date
dataset <- rbind(dataset, cbind(d,x))
# update after transforming the dataset
count <- nrow(dataset)
dataset <- dplyr::arrange(dataset, date)
#### Aggregate the Data (by day) ####
# Get base values
i <- 2
x <- data.frame(dataset$values[1])
d <- data.frame(dataset$date[1])
names(x) <- c("values")
names(d) <- c("date")
ptr <- 1
# Loop through each value
while(i <= count){
# Get current and previous months and years
thisMonth <- substring(dataset$date[i],6,7)
lastMonth <- substring(dataset$date[i-1],6,7)
today <- substring(dataset$date[i],9,10)
yesterday <- substring(dataset$date[i-1],9,10)
# Compare month values, either greater or lower
# Also compare year
if( (today > yesterday) || (thisMonth != lastMonth)){
# Increment pointer
ptr <- ptr + 1
# Add new row to Date representing new day
de <- data.frame(dataset$date[i])
names(de) <- c("date")
d <- rbind(d, de)
# Add new row to Values representing new aggregate
dx <- data.frame(dataset$values[i])
names(dx) <- c("values")
x <- rbind(x, dx)
# Otherwise, aggregate data
}
else if( today == yesterday){
x$values[ptr] <- x$values[ptr] + dataset$values[i]
}
# Finish loop, increment i
i <- i + 1
}
dataset <- cbind(d,x)
# update after transforming the dataset
count <- nrow(dataset)
# Create time series object and forecasted ARIMA fit model
y = ts(dataset$values, start=c(firstyear, yday(firstdate)), frequency=365)
y <- tsclean(y)
fit = forecast(auto.arima(y), 95)
# Plot everything and format axis
plot(fit, xaxt="n",main="Forecasted Headcount Requirements (per Day)", ylim=c(0,max(y)),cex.main=1.5)
a = seq(as.Date(firstdate, tz = "UTC"), by="month", length=95 + count)
axis(1, at = decimal_date(a), labels = format(a, "%Y %b %d"), cex.axis=1.0, ylab = 'value',FUN = sum)
abline(v = decimal_date(a), col='grey', lwd=0.5)
Solved! Go to Solution.
@Anonymous,
Please take a look at the following blogs and similar thread about how to group data by month using R.
Plot Weekly or Monthly Totals in R
Group data by month in R
Aggregate Daily Data to Month/Year intervals
In addition, why not directly create bar chart in Power BI Desktop?
Regards
Lydia
@Anonymous,
Please take a look at the following blogs and similar thread about how to group data by month using R.
Plot Weekly or Monthly Totals in R
Group data by month in R
Aggregate Daily Data to Month/Year intervals
In addition, why not directly create bar chart in Power BI Desktop?
Regards
Lydia
Hi Lydia,
Thanks for the response, i have gone through the links, but as i am new to R not able to get required output. And let me know how to create bar chart for forecasting.
Thanks
Rajmohan
@Anonymous,
What is the sample data of your table? You can firstly execute the script in R studio and then apply them in Power BI.
Regards,
Lydia
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
112 | |
100 | |
80 | |
64 | |
57 |
User | Count |
---|---|
146 | |
110 | |
93 | |
84 | |
67 |