solar sample #688
Conversation
review-notebook-app
bot
commented
May 13, 2020
|
Check out this pull request on ReviewNB. Review Jupyter notebook visual diffs & provide feedback on notebooks. Powered by ReviewNB |
|
@guneetmutreja can you do the first round of review for this? |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:02Z Accessing & Visualizing the datasets 1 — Fully Connected Network (FCN) FCN Model Result Visualization
These entries in the TOC do not match the headings in the notebook below. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:03Z Some grammatical fixes:
Recently there has been a great emphasis on reducing carbon footprints by moving away from fossil fuels to renewable energy sources for running our cities. Local governments across the world, such as the City of Calgary in Canada, are leading this change by becoming energy independent: they install solar power plants either on the rooftops or within the site areas of their city utilities to run their operations. With this in mind, here is a notebook that computes the amount of energy a solar power plant would produce using weather variables at any such site, and subsequently estimates the total power plant capacity required to satisfy the site's daily needs.
Given a location in latitude and longitude, this notebook can predict the daily, and hence annual, solar energy generation of a solar power station at the site. The hypothesis is that weather parameters such as temperature, wind speed, vapor pressure, solar radiation, day length, precipitation, and snowfall, along with the altitude of a place, affect the generation of solar energy on a given day.
Accordingly, these variables are used to train a model on the actual solar power generated by solar stations located in Calgary, Canada, which can then be used to predict solar generation for prospective solar plants at other locations. Besides, the total energy generation also depends on the capacity of the solar station. For example, a 100 kWp solar plant will generate more energy than a 50 kWp plant; hence, the capacity of the plant must be taken into account for the final output. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:04Z Some grammatical changes.
Out of the several solar photovoltaic power plants in the City of Calgary, 11 were selected for the study. The dataset contains two components: 1) Daily solar energy production for each power plant from September 2015 to December 2019. 2) Corresponding daily weather measurements for the given sites.
The datasets were obtained from multiple sources as mentioned here (Data resources) and preprocessed to obtain the main dataset used here. Two feature layers were subsequently created from them.
The hyperlink to "Data Resources" does not lead to the intended location. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:05Z Please add a screenshot of the map. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:05Z Some grammatical changes:
In the above table, each row represents one day from September 2015 to December 2019, with the corresponding dates shown in the field Field1, and the field solar_plan gives the names of the solar sites. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:06Z Please add a screenshot of the map. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:07Z Once the training and validation datasets are processed and analyzed, they are ready to be used for modeling. In this sample, two methods are used for modeling: 1) Fully Connected Network - First, a deep learning framework called Fully Connected Network (FCN), available in the arcgis.learn module of the ArcGIS API for Python, is used. 2) Machine Learning Model - In the second option, one of the machine learning algorithms from scikit-learn is implemented via the MLModel framework available in arcgis.learn. This framework can deploy any ML algorithm from the scikit-learn library just by passing the name of the algorithm and its relevant parameters as keyword arguments. Finally, the performance of the two methods is compared in terms of model training and validation accuracy. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:08Z First, a list is made consisting of the feature data that will be used for predicting daily solar energy generation. By default, each variable is treated as continuous; for a categorical variable, the value True should be passed inside a tuple along with the variable. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:09Z Here, the learning rate suggested by the lr_find method was around 0.000575. The automatic lr_finder takes a conservative estimate of the learning rate, but an experienced user can interpret the graph and find a better learning rate to use for the final training of the model. |
review-notebook-app
bot
commented
Jun 8, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-08T11:21:10Z In the above table, the values predicted by the model on the test set, in the last column named prediction_results, and the actual values of the target variable, in the column named capacity_f, are highly similar.
Accordingly, the metrics of the trained model are now estimated: the mean absolute error and R-squared of the model fit are checked. |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T05:26:03Z In the above table, the values predicted by the model on the test set, in the last column named prediction_results, and the actual values of the target variable, in the column named capacity_f, are highly similar.
Accordingly, the metrics of the trained model are now estimated: the mean absolute error and R-squared of the model fit are checked. |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T05:26:04Z
KeyError: "['prediction'] not in index" is raised by the line: test_pred_datetime = test_pred_layer_sdf[['Field1','capacity_f','prediction']].copy()
Also, it would be good to merge some calls into a single line. For example, the two lines: test_pred_datetime = test_pred_datetime.drop(['date','capacity_f','prediction'], axis=1) and test_pred_datetime = test_pred_datetime.sort_index()
can be merged into one: test_pred_datetime = test_pred_datetime.drop(['date','capacity_f','prediction'], axis=1).sort_index() |
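The KeyError reported above suggests that the column name in the selection no longer matches what the prediction step writes; a later comment in this thread mentions prediction_results rather than prediction. Below is a minimal sketch of a defensive lookup, written in plain Python so it runs without the notebook's data. The helper name `first_present` and the candidate list are hypothetical and are not part of the notebook or of any library.

```python
def first_present(columns, candidates):
    """Return the first candidate name that exists in `columns`.

    `columns` can be any iterable of names, for example list(df.columns).
    """
    available = set(columns)
    for name in candidates:
        if name in available:
            return name
    raise KeyError(f"none of {list(candidates)} found in {sorted(available)}")


# Column names are taken from the review comments; the list is illustrative.
columns = ["Field1", "capacity_f", "prediction_results"]
pred_col = first_present(columns, ["prediction", "prediction_results"])
print(pred_col)
```

With a helper like this, the selection could use `pred_col` instead of a hard-coded name, so the cell fails with a clearer message if neither column exists.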
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:58:59Z Can we also mention the variable field names in brackets next to their descriptions in the above line, like: The plot shows that variable of shortwave radiation per meter square (srad__W_m_) .... |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:00Z Can we have a little more detail of this here? I feel this is the first sample featuring FCN so a little more detail will be required. |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:00Z Please shorten the comment or wrap it onto the next line to avoid horizontal scrolling in the cells. |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:01Z Are we checking MAE also? If so, please add that too as the code below has only R-squared value |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:02Z # the model.score method from the tabular learner returns mean squared error |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:02Z "The table above returns the predicted values for the Southland photovoltaic power plant stored in the field called prediction which has the model estimated daily capacity factor of energy generation, whereas the actual capacity factor is in the field named capacity_f. "
I saw prediction_results instead of prediction.
|
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:03Z It seems seaborn does not come as a default package with the installation.
I would include a cell with the command to install seaborn just before these lines:
|
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:04Z Similar to the data preparation process for the neural network, first a list is made consisting of the feature data that will be used for predicting daily solar energy generation. By default, each variable is treated as continuous; for a categorical variable, the value True should be passed inside a tuple along with the variable. These variables are then transformed by the RobustScaler function from scikit-learn by passing it, along with the variable list, into the column transformer function as follows: |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:05Z The input parameters required for the tool are similar to the ones mentioned previously: |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:06Z Finally, the model is now ready for training, and the model.fit method is used for fitting the machine learning model with its defined parameters mentioned in the previous step. |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:07Z Subsequently, the metrics of the trained model are estimated: the mean absolute error and R-squared of the model fit are checked. Currently the model.score() function gives the R-squared, while the mean squared error is obtained using scikit-learn metrics. |
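For readers who want to see what these metrics measure, here is a minimal pure-Python sketch of the standard definitions of mean absolute error, mean squared error, and R-squared. It is not the notebook's code: the function name `regression_metrics` is hypothetical, and the input numbers are made-up toy values, not results from the model.

```python
def regression_metrics(actual, predicted):
    """Return (mae, mse, r2) for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    r2 = 1.0 - (mse * n) / ss_tot                  # 1 - SS_res / SS_tot
    return mae, mse, r2


# Toy capacity-factor values, for illustration only.
actual = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]
mae, mse, r2 = regression_metrics(actual, predicted)
print(f"MAE={mae:.4f}  MSE={mse:.5f}  R2={r2:.3f}")
```

Reporting MAE or MSE alongside R-squared is useful because R-squared alone says how much variance is explained, not how large the typical error is in the target's units.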
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:08Z The low MSE and high R-squared value indicate that the model has been trained well. This model also achieved a higher R-squared and a lower MSE than the previous fully connected network model. |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:08Z The trained RandomForestRegressor model implemented via MLModel will now similarly be used to predict the daily lifetime solar energy generation of the solar plant at the Southland Leisure Centre since its installation in 2015. The aim is to compare and validate its performance against that obtained previously by the FCN model. |
review-notebook-app
bot
commented
Jun 9, 2020
|
View / edit / reply to this conversation on ReviewNB guneetmutreja commented on 2020-06-09T06:59:09Z This raises an error for me: KeyError: "['prediction'] not in index". Please have a look. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:47Z In the plots above, it can be seen that each of the variables has a high seasonality, and it seems that there is a relationship between the dependent variable kWh_filled and the explanatory variables. As such, a correlation plot should be created to check the correlation between the variables. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:48Z The resulting correlation plot shows that the variable of shortwave radiation per meter square (sradW_m_) has the largest correlation with the dependent variable of total solar energy produced expressed in terms of capacity factor (capacity_f). This is followed by the variable of day length (dayls_), as longer days are likely to produce more solar energy. These two are closely followed by max (tmaxdeg) and min (tmindeg) daily temperatures, and lastly the remaining variables with weaker correlation values. |
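The ranking described above rests on pairwise correlation coefficients. As a minimal sketch, assuming the plot uses the Pearson coefficient (the comment does not say which one), the value for a single pair of variables can be computed as follows. The function name `pearson` and the toy series are illustrative and do not come from the notebook's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values standing in for a radiation series and a capacity factor.
radiation = [100.0, 200.0, 300.0, 400.0]
capacity_factor = [0.05, 0.11, 0.14, 0.21]
print(round(pearson(radiation, capacity_factor), 3))
```

A value near 1 or -1 indicates a strong linear relationship; values near 0 correspond to the weaker variables mentioned at the end of the comment.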
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:48Z The validation set consists of daily solar generation data from September 2015 to December 2019 for one solar site, known as Southland Leisure Centre, and will be used to validate the trained model. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:49Z Model Building
Once the training and the validation datasets have been processed and analyzed, they are ready to be used for modeling. In this sample, two methods are used for modeling: 1) 2) Finally, performance between the two methods will be compared in terms of model training and validation accuracy. Further details on |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:50Z This is an Artificial Neural Network model from the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:51Z Data Preprocessing
First, a list is made that consists of the feature data that will be used for predicting daily solar energy generation. By default, it will receive continuous variables, and in the case of categorical variables, the True value should be passed inside a tuple along with the variable. In this example, all of the variables are continuous. |
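The list format described above can be sketched in plain Python. In this sketch, `srad__W_m_` is the field name quoted earlier in this thread; the other names, including the categorical `site_type`, are hypothetical placeholders (the notebook itself uses only continuous variables). The helper `split_by_kind` is likewise illustrative and not part of arcgis.learn.

```python
# Continuous variables are plain strings; a categorical variable is written
# as a (name, True) tuple, following the convention described above.
explanatory_variables = [
    "srad__W_m_",            # field name quoted in an earlier comment
    "day_length",            # hypothetical continuous field
    ("site_type", True),     # hypothetical categorical field
]


def split_by_kind(variables):
    """Separate a mixed variable list into continuous and categorical names."""
    continuous, categorical = [], []
    for var in variables:
        if isinstance(var, tuple):
            name, is_categorical = var
            (categorical if is_categorical else continuous).append(name)
        else:
            continuous.append(var)
    return continuous, categorical


continuous, categorical = split_by_kind(explanatory_variables)
print("continuous:", continuous)
print("categorical:", categorical)
```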
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:51Z Once the explanatory variables are identified, the main preprocessing of the data is carried out by the The input parameters required for the tool are:
|
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:52Z Once the data has been prepared by the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:53Z Model Training
Finally, the model is now ready for training. To train the model, the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:54Z The train_loss and valid_loss fields are plotted to check whether the model is over-fitting. The resulting plot shows that the model has been trained well and that the losses are gradually decreasing, but not significantly. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:54Z In the table above, the values predicted by the model when applied to the test set, prediction_results, are similar to the actual values of the test set, capacity_f.
As such, the model metrics of the trained model can now be estimated using the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:55Z Solar Energy Generation Forecast & Validation
The trained model( Accordingly, the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:56Z The table above returns the predicted values for the Southland photovoltaic power plant stored in the field called prediction_results, which holds the model-estimated daily capacity factor of energy generation, whereas the actual capacity factor is in the field named capacity_f. The capacity factor is a normalized value that will be rescaled back to the original unit of kWh by using the peak capacity of the Southland photovoltaic power plant of 153 kWp. |
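The rescaling step mentioned above can be sketched as below. The 153 kWp peak capacity comes from the comment; the normalization itself is an assumption, since the thread does not state the formula. Here the daily capacity factor is taken to be daily energy divided by the energy the plant would produce running at peak capacity for 24 hours. The function and constant names are hypothetical.

```python
PEAK_CAPACITY_KWP = 153.0  # Southland plant peak capacity, from the comment above
HOURS_PER_DAY = 24.0       # ASSUMPTION: factor is normalized by a full 24 h day


def capacity_factor_to_kwh(capacity_factor,
                           peak_kwp=PEAK_CAPACITY_KWP,
                           hours=HOURS_PER_DAY):
    """Convert a daily capacity factor (0..1) back to daily energy in kWh."""
    if not 0.0 <= capacity_factor <= 1.0:
        raise ValueError("capacity factor is expected to lie between 0 and 1")
    return capacity_factor * peak_kwp * hours


# Illustrative capacity factors, not model output.
for cf in (0.05, 0.10, 0.20):
    print(cf, "->", round(capacity_factor_to_kwh(cf), 1), "kWh")
```

If the notebook normalizes differently (for example by daylight hours), only `HOURS_PER_DAY` would need to change.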
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:57Z The comparison returns a high r-square of 0.86, showing a high similarity between the actual and predicted values. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:57Z Summarizing the values, the actual average annual energy generated by the solar plant is 170.03 MWh, which is close to the predicted annual average generated energy of 170.08 MWh, indicating a high level of precision. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:58Z In the plot above, the blue line represents the actual generation values, and the orange line represents the predicted generation values. The two show a high degree of overlap, indicating that the model has a high predictive capacity. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:59Z 2 - MLModel
In the second method, a machine learning model is applied to model the same data using the MLModel framework from |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:23:59Z Data Preprocessing
Like the data preparation process for the neural network, first a list is made consisting of the feature data that will be used for predicting daily solar energy generation. By default, it will receive continuous variables, whereas for categorical variables, the True value should be passed inside a tuple along with the variables. These variables are then transformed by the RobustScaler function from scikit-learn by passing it, along with the variable list, into the column transformer function as follows: |
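To illustrate what the RobustScaler transformation does to a single variable, here is a minimal pure-Python sketch. With its default settings, scikit-learn's RobustScaler centers each feature on its median and divides by the interquartile range (25th to 75th percentile); the sketch below mirrors that idea for one column and is not the notebook's code. The function name `robust_scale` and the sample values are illustrative.

```python
import statistics


def robust_scale(values):
    """Center on the median and scale by the interquartile range (IQR)."""
    median = statistics.median(values)
    # 'inclusive' uses linear interpolation between the sorted data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; the variable cannot be scaled this way")
    return [(v - median) / iqr for v in values]


# Made-up values with one outlier, to show that the bulk of the data
# keeps a compact range while the outlier stays clearly separated.
sample = [1, 2, 3, 4, 100]
print(robust_scale(sample))
```

Because the median and IQR are barely affected by extreme values, this kind of scaling is a common choice for weather variables that occasionally spike.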
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:00Z Once the explanatory variables list is defined and the preprocessors are computed, they are now used as input for the The input parameters required for the tool are similar to the ones mentioned previously: |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:01Z Model Initialization
Once the data has been prepared by the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:02Z In the table above, the last column, capacity_f_results, returns the values predicted by the model, which are similar to the actual values in the target variable column, capacity_f.
Subsequently, the model metrics of the trained model are now estimated using the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:03Z Solar Energy Generation Forecast & Validation
The trained GradientBoostingRegressor model, implemented via the To reiterate, the |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:04Z The table above returns the The capacity factor is a normalized value that will be rescaled back to the original unit of kWh by using the peak capacity of the Southland photovoltaic power plant of 153 kWp. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:04Z The comparison returns a high R-squared of 0.84, indicating a high similarity between the actual and predicted values. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:06Z Summarizing the values, the actual average annual energy generated by the solar plant is 170.03 MWh, which is close to the predicted annual average generated energy of 171.48 MWh. This indicates a high level of precision. |
review-notebook-app
bot
commented
Aug 6, 2020
|
View / edit / reply to this conversation on ReviewNB BP-Ent commented on 2020-08-06T22:24:06Z The goal of this project was to create a model that could predict the daily solar energy efficiency, the actual output of photovoltaic solar energy, of a location using the daily weather variables of the site as inputs, thereby demonstrating the newly implemented artificial neural network of Accordingly, data from 10 solar energy installation sites in the City of Calgary in Canada were used to train two different models, the first being the Comparison of the results shows that both models successfully predicted the solar energy output of the test solar plant with predicted values of 171.76 MWh and 171.51 MWh by the Finally, to expand on this model further in the future, it would be interesting to apply this model to other solar generation plants located across different geographies and to record its performance to understand the generalizability of the model. |
|
Suggested changes noted in ReviewNB. |
|
#782 duplicate |
|
@BP-Ent Thanks for the review. Your suggestions are incorporated and a new PR is created. |
moonlanderr commented May 13, 2020
solar energy prediction
Checklist
Please go through each entry in the below checklist and mark an 'X' if that condition has been met. Every entry should be marked with an 'X' to get the Pull Request approved.
imports are in the first cell? First block of imports are standard libraries, second block are 3rd party libraries, third block are all arcgis imports? Note that in some cases, for samples, it is a good idea to keep the imports next to where they are used, particularly for uncommonly used features that we want to highlight.
GIS object instantiations are one of the following?
gis = GIS()
gis = GIS('https://www.arcgis.com', 'arcgis_python', 'P@ssword123')
gis = GIS(profile="your_online_profile")
gis = GIS('https://pythonapi.playground.esri.com/portal', 'arcgis_python', 'amazing_arcgis_123')
gis = GIS(profile="your_enterprise_portal")
./misc/setup.py and/or ./misc/teardown.py?
<img src="base64str_here"> instead of <img src="https://some.url">? All map widgets contain a static image preview? (Call mapview_inst.take_screenshot() to do so)
os.path.join()? (Instead of r"\foo\bar", os.path.join(os.path.sep, "foo", "bar"), etc.)