Let’s say I wanted to pass a list of UserIDs to a stored procedure that uses a CASE WHEN statement to update our Test table. I want to update the column Approval, setting statuses of <Reject> to <Rejected> and <Approve> to <Approved>, when I pass the [UserID] and [Approval] columns to the stored procedure. If I pass a UserID with a value other than <Approve> or <Reject>, it should be ignored.
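A minimal sketch of how that could be set up, assuming a table-valued parameter and illustrative names (dbo.UserApprovalList, dbo.Test, dbo.usp_UpdateApproval are placeholders, not a real schema):

```sql
-- Hypothetical table type pairing each UserID with its requested action.
CREATE TYPE dbo.UserApprovalList AS TABLE
(
    UserID   INT         NOT NULL,
    Approval VARCHAR(20) NOT NULL
);
GO

CREATE PROCEDURE dbo.usp_UpdateApproval
    @Users dbo.UserApprovalList READONLY
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE t
    SET t.Approval = CASE u.Approval
                         WHEN 'Approve' THEN 'Approved'
                         WHEN 'Reject'  THEN 'Rejected'
                     END
    FROM dbo.Test AS t
    INNER JOIN @Users AS u
        ON u.UserID = t.UserID
    WHERE u.Approval IN ('Approve', 'Reject'); -- any other value is ignored
END;
```

The WHERE clause is what ignores the unexpected values; rows carrying anything other than <Approve> or <Reject> never reach the UPDATE.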
Puzzles can be fun or frustrating, but I like them because they can be a great way to showcase who you are and how you work. For example, I was given a puzzle by a company that was interested in me but wanted to know how I approached specific problems. They gave me a simple SQL query and three word problems to go along with it. My reply illustrated how I approach problems, document findings, performance test, and work to obtain business requirements.
I gave them back a long paper that broke down the primary problem into nine different scenarios for interpreting their question, and I wrote a total of 35 queries spanning all nine scenarios to pick the best option. This included a procedure they could run to test these scenarios and see my work.
I’m going to preface this by saying that I ran into a specific scenario, and I strongly suggest you verify that this scenario is what’s currently affecting you before you try these troubleshooting steps.
Shane also describes for us what this term means in our context:
Dipping into the Cookie Jar is about when the going gets tough and you don’t think you can handle any more, then you think back on your accomplishments and take some sustenance from them. You dip back into that cookie jar and use whatever energy that provides to keep going.
If you are seeing an error similar to “Conversion failed when converting the varchar value ‘something’ to data type int.”, I find the easiest way to pin down a solution is to analyze the data causing the error.
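For example, on SQL Server 2012 and later, TRY_CAST surfaces the offending values quickly (the table and column names here are placeholders):

```sql
-- Rows whose varchar value cannot convert to INT; these cause the error.
SELECT SomeColumn
FROM dbo.SomeTable
WHERE TRY_CAST(SomeColumn AS INT) IS NULL
  AND SomeColumn IS NOT NULL;
```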
There are many myths and misconceptions regarding TRUNCATE in SQL Server. I think this is heavily influenced by how other Relational Database Management Systems (RDBMS) handle TRUNCATE.
I wanted to take a moment and reflect on 2018. It was an incredibly successful year for me, and I want to thank everyone who has helped make that possible. The SQL community is always responsive and helpful, my co-workers are wonderful, and of course my family and friends are always supportive and understanding. So thank you, everyone, for helping me hit my goals!
Something I’ve automated* recently was my import of Toggl data! So T-SQL Tuesday #110 seems like a great time to post this solution. Thank you Garry Bargsley for the great idea and for being this month’s host!
I put an asterisk next to automated, as obtaining the file from Toggl and placing it in its location is not automatic. They provide a premium feature to allow this, or you could work on creating your own web scraper. (Which is probably against their terms of use, since they offer the file drop as a premium feature.)
So while that piece is not automatic, the ingestion, cleanup, and archive of data from the Toggl file is. Generally, when I automate a task, I will use a combination of PowerShell, SSIS, Biml, Python, T-SQL, Bash, and whatever else may be necessary to get the job done.
Steve Jones got me hooked on Toggl back in July, and I found it pretty easy to use. Since then I’ve begun creating analytics on tasks at my job and in my life. I log my study time, blog work, and other extracurricular activities so I can begin to cultivate a data set about how I use my time.
Another thought is that you can use Toggl with analytics to give your stakeholders insight into how your time is spent. Maybe you need to be left out of meetings because 25% of your day is spent in them? Maybe you have been doing too many projects for Todd from accounting? This gives your stakeholders true insight into your day, using tools to easily capture and identify pain points that can be optimized.
Depending on your Toggl setup, you may not want to be very detailed with the information you provide. I personally normalize the data I input into Toggl, so it’s meaningless numbers without my master map, which resides on my local computer. I want to keep my clients’ data secure, and this information is hosted on Toggl’s servers. So that’s some food for thought: be mindful of what you are inputting into this online tool and what you’ve signed as far as non-disclosure agreements.
So, on to the SSIS package!
DISCLAIMER!!!
This should be done in a personal or safe environment, never production!
Created with Visual Studio 2017 Version 15.8.5, .NET Framework V 4.7.03056.
DBMS is Microsoft SQL Server 2017 (RTM) – 14.0.1000.169 (X64)
What does this SSIS package do? Very simply: you export your detailed year view from Toggl (which is typically a pretty small data set, at least from my perspective), place it in a folder, set your SSIS parameter to that folder location, create a new database or pick an existing one and set it in the project connections, and let it rip.
Here is how you get your data from Toggl: run the detailed report for the year and export it to a file.
Here is a link to my GitHub project: https://github.com/Jonathanshaulis/Toggl_Import, so you can either fork or download the SSIS files. Place them somewhere on your computer where you want to store SSIS packages.
Explanation of the SSIS package
The package starts by checking for the existence of the archive and staging schemas, creating them if they don’t exist. It then does the same for the tables: we have the primary table we use, the staging table, and the archive table for when you have years of data stockpiled over time.
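A minimal sketch of those existence checks (the object names are assumptions based on this walkthrough, and the real column list comes from the Toggl export):

```sql
-- Create the schemas only if they are missing.
IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = N'staging')
    EXEC (N'CREATE SCHEMA staging;');

IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = N'archive')
    EXEC (N'CREATE SCHEMA archive;');

-- Same idea for each table; abbreviated column list shown here.
IF OBJECT_ID(N'staging.Toggl', N'U') IS NULL
    CREATE TABLE staging.Toggl
    (
        [User]        NVARCHAR(255) NULL,
        [Description] NVARCHAR(255) NULL,
        StartDate     DATE          NULL,
        Duration      NVARCHAR(50)  NULL
        -- remaining Toggl export columns omitted for brevity
    );
```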
We’ll truncate staging and then load the staging table immediately after. Using the base data set from Toggl, I found it difficult to analyze the summed durations, so I create a new column on import that takes the duration and turns it into a decimal. This made it more convenient to report off of in Power BI.
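The package does this with a derived column during the load; in T-SQL terms, the transformation is roughly the following (assuming the Duration arrives as hh:mm:ss text under 24 hours):

```sql
-- Convert an hh:mm:ss duration string to decimal hours, e.g. '01:30:00' -> 1.5.
SELECT Duration,
       DATEDIFF(SECOND, CAST('00:00:00' AS TIME), CAST(Duration AS TIME)) / 3600.0
           AS DurationDecimal
FROM staging.Toggl;
```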
Also in the Data Flow (DFT – File to Staging) we have two row counts occurring: one counts errors and the other counts records that flowed to the destination.
Errors are logged to DFTStagingCountErr and the successful imports are logged to DFTStagingCount.
Afterward, we gather a count of the rows in the staging table and assign the count to the variable ESTStagingCount.
There is a condition check after the count that will stop the process if the two variables DFTStagingCountErr and DFTStagingCount, added together, do not match the value of ESTStagingCount. This is important because it ensures a successful import has occurred before the package continues onward and truncates our primary table to perform a full load.
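In SSIS expression terms, that precedence constraint boils down to something like this (the exact variable scoping is an assumption):

```
@[User::DFTStagingCount] + @[User::DFTStagingCountErr] == @[User::ESTStagingCount]
```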
Once we are sure all records made it to our staging table, we truncate our destination and move the file’s data into our “production” table. This gives a complete refresh of the data for the year, but it also presents a conundrum.
What happens when we have two years of data?
The final step archives the data set that you’ve imported.
The query performs a “merge upsert” with a not matched deletion at the end.
If a row matches exactly between the archive table and production table, it will update the Toggl_ID to keep the latest version of that row.
If a row is unmatched on all columns, it will interpret that as a new row and insert it.
If a row is unmatched on all columns and the date of the row matches the max date found in the production table, it will delete the row. This logic allows data changes to flow to the archive table, making the hosted Toggl data source the golden data source. This means rows from 2018 would not be deleted from the archive once you import a 2019 data set, but a changed 2019 row would be deleted and the newly corrected row imported in its place.
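Put together, the archive step looks roughly like the MERGE below. The object names are assumed from this walkthrough and the column list is abbreviated; it’s a sketch of the logic, not the package’s exact query.

```sql
-- The newest date in the production table; deletes only apply within it.
DECLARE @MaxDate DATE = (SELECT MAX(StartDate) FROM dbo.Toggl);

MERGE archive.Toggl AS a
USING dbo.Toggl AS p
    ON  a.[User]        = p.[User]
    AND a.[Description] = p.[Description]
    AND a.StartDate     = p.StartDate
    AND a.Duration      = p.Duration
WHEN MATCHED THEN
    -- Exact match on all columns: keep the latest version of the row.
    UPDATE SET a.Toggl_ID = p.Toggl_ID
WHEN NOT MATCHED BY TARGET THEN
    -- New row: insert it into the archive.
    INSERT ([User], [Description], StartDate, Duration, Toggl_ID)
    VALUES (p.[User], p.[Description], p.StartDate, p.Duration, p.Toggl_ID)
WHEN NOT MATCHED BY SOURCE AND a.StartDate = @MaxDate THEN
    -- Archive-only row dated within the current import: delete it.
    DELETE;
```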
So how do I run it when I have it open?
You’ll want to edit the Local connection manager to point to your instance. I would recommend creating a new database, something like “Career”. (I will be uploading more blog posts later regarding Brent Ozar’s Salary Survey and creating a database portfolio in SSIS using the Career DB.)
The last modification you need to make is to adjust the value of the parameter File_Location. It doesn’t matter whether or not you include the final backslash at the end of the string; I have error handling in an expression to make sure it can take either variation. Just make sure your file is in that location!
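That error handling is just a small SSIS expression along these lines (the parameter namespace here is a guess, and since SSIS expressions don’t allow comments, the hedging lives in this sentence): if the path already ends in a backslash, use it as-is; otherwise, append one.

```
RIGHT(@[$Package::File_Location], 1) == "\\"
    ? @[$Package::File_Location]
    : @[$Package::File_Location] + "\\"
```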
Voila! That’s it! To recap:
Download SSIS package / Toggl File.
Place SSIS package and Toggl File in ideal locations.
Open SQL Server and create a new database called Career (or anything else you choose), or just use an already existing database.
Open the SSIS package, adjust the Local connection to your instance and database.
Edit the Parameter for the File_Location and point it to your file.
F.A.Q. and Troubleshooting
If you have questions or run into issues, please let me know and I’ll see what I can do to help troubleshoot. I’ll modify this section as time goes on and as changes are needed.