Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Need to aggregate value by month in R script forecast report

Hi All,

I am very new to R script, i have a r script where i need to edit as per requirement, below is forecast chart where it shows the actuals and forecast values by days, i want to show the summarize value by month. And second image is the requirement.

1) Forecast1) Forecast2) Required chart2) Required chart

Here is my R code-using two columns from dataset for dates and values
### @date 2017-07-28
### R Script used for displaying a graph containing 3-month forecast
### aggregated by day.

Sys.setenv(TZ='UTC')

# Libraries
library('ggplot2')
library('forecast')
library('tseries')
library('lubridate')

 

# Determine which column is the date column
firstCol = dataset[1,1]
secondCol = dataset[1,2]
isFirstDate <- grepl("[0-9]+-[0-9]+-[0-9]+", toString(firstCol))
isSecondDate <- grepl("[0-9]+-[0-9]+-[0-9]+", toString(secondCol))
dateColumn <- 1
valuesColumn <- 2
if(!isFirstDate){
dateColumn <- 2
valuesColumn <- 1
}
theDates <- data.frame(dataset[dateColumn])
names(theDates) <- c("date")
theValues <- data.frame(dataset[valuesColumn])
names(theValues) <- c("values")
dataset <- cbind(theDates, theValues)

# Get information on dataset, create date sequence
count <- nrow(dataset)
firstdate = dataset$date[1]
firstyear = as.double(substring(dataset$date[1], 1, 4))
lastdate = dataset$date[count]
lastyear = as.double(substring(dataset$date[count], 1, 4))
dataset$date <- sprintf("%s%s", dataset$date, " 00:00:00 UTC")
dataset <- dplyr::arrange(dataset, date)

# Fill in the empty dates with zeros
# Get base values
i <- 2
x <- data.frame(dataset$values[1])
d <- data.frame(dataset$date[1])
names(x) <- c("values")
names(d) <- c("date")

# Loop through each value
while(i <= count){

# Get current day, month values
thisMonth <- substring(dataset$date[i],6,7)
lastMonth <- substring(dataset$date[i-1],6,7)
today <- substring(dataset$date[i],9,10)
yesterday <- substring(dataset$date[i-1],9,10)

# Add missing dates if applicable

# Add as many days as differenced from last date entry
if( ((as.numeric(yesterday) + 1) != as.numeric(today)) & (thisMonth == lastMonth)){

# How many days do we need to fill in?
daysToFill <- as.numeric(today)-as.numeric(yesterday)-1

# Fill in the days
while(daysToFill > 0){
# Add to new dataset
zeroValue <- data.frame(0)
names(zeroValue) <- c("values")
x <- rbind(x, zeroValue)
dateValue <- as.Date(sprintf("%s-%s-%s",substring(dataset$date[i],1,4), thisMonth, toString(as.numeric(substring(dataset$date[i],9,10)) - daysToFill)))
dateValue <- data.frame(sprintf("%s%s", as.Date(dateValue), " 00:00:00 UTC"))
names(dateValue) <- c("date")
d <- rbind(d, dateValue)
# decrement the daysToFill counter
daysToFill <- daysToFill - 1
}

# Case when the first days of month are skipped
} else if(thisMonth != lastMonth){
# Get amount of days to fill
daysToFill <- as.numeric(today)-1

# Add each day missing
while (daysToFill > 0){
# Append dataset
zeroValue <- data.frame(0)
names(zeroValue) <- c("values")
x <- rbind(x, zeroValue)
dateValue <- sprintf("%s-%s-%s",substring(dataset$date[i],1,4), thisMonth, as.character(daysToFill))
dateValue <- data.frame(sprintf("%s%s", as.Date(dateValue), " 00:00:00 UTC"))
names(dateValue) <- c("date")
d <- rbind(d, dateValue)

daysToFill <- daysToFill - 1
}
}

# If no missing values exist, then continue and then add to original dataset
i <- i + 1
}

# Append the dataset, then sort by date
dataset <- rbind(dataset, cbind(d,x))
# update after transforming the dataset
count <- nrow(dataset)
dataset <- dplyr::arrange(dataset, date)

#### Aggregate the Data (by day) ####
# Get base values
i <- 2
x <- data.frame(dataset$values[1])
d <- data.frame(dataset$date[1])
names(x) <- c("values")
names(d) <- c("date")
ptr <- 1


# Loop through each value
while(i <= count){

# Get current and previous months and years
thisMonth <- substring(dataset$date[i],6,7)
lastMonth <- substring(dataset$date[i-1],6,7)
today <- substring(dataset$date[i],9,10)
yesterday <- substring(dataset$date[i-1],9,10)

# Compare month values, either greater or lower
# Also compare year
if( (today > yesterday) || (thisMonth != lastMonth)){

# Increment pointer
ptr <- ptr + 1

# Add new row to Date representing new day
de <- data.frame(dataset$date[i])
names(de) <- c("date")
d <- rbind(d, de)

# Add new row to Values representing new aggregate
dx <- data.frame(dataset$values[i])
names(dx) <- c("values")
x <- rbind(x, dx)

# Otherwise, aggregate data
}
else if( today == yesterday){

x$values[ptr] <- x$values[ptr] + dataset$values[i]

}

# Finish loop, increment i
i <- i + 1
}

dataset <- cbind(d,x)

# update after transforming the dataset
count <- nrow(dataset)

# Create time series object and forecasted ARIMA fit model
y = ts(dataset$values, start=c(firstyear, yday(firstdate)), frequency=365)
y <- tsclean(y)
fit = forecast(auto.arima(y), 95)

# Plot everything and format axis
plot(fit, xaxt="n",main="Forecasted Headcount Requirements (per Day)", ylim=c(0,max(y)),cex.main=1.5)
a = seq(as.Date(firstdate, tz = "UTC"), by="month", length=95 + count)
axis(1, at = decimal_date(a), labels = format(a, "%Y %b %d"), cex.axis=1.0, ylab = 'value',FUN = sum)
abline(v = decimal_date(a), col='grey', lwd=0.5)

 

 

 

 

 

1 ACCEPTED SOLUTION
v-yuezhe-msft
Employee
Employee

@Anonymous,

Please take a look at the following blogs and similar thread about how to group data by month using R.

Plot Weekly or Monthly Totals in R
Group data by month in R
Aggregate Daily Data to Month/Year intervals

In addition, why not directly create bar chart in Power BI Desktop?

Regards
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

3 REPLIES 3
v-yuezhe-msft
Employee
Employee

@Anonymous,

Please take a look at the following blogs and similar thread about how to group data by month using R.

Plot Weekly or Monthly Totals in R
Group data by month in R
Aggregate Daily Data to Month/Year intervals

In addition, why not directly create bar chart in Power BI Desktop?

Regards
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hi Lydia,

 

Thanks for the response, i have gone through the links, but as i am new to R not able to get required output. And let me know how to create bar chart for forecasting.

 

Thanks

Rajmohan

@Anonymous,

What is the sample data of your table? You can firstly execute the script in R studio and then apply them in Power BI.

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.