A DBA's Journey to APEX - Part 1: From Napkin to Application in a Single Weekend
Posted by Harry E Fowler | Last updated 7/24/19
By Jim Czuprynski | Edited by April Sims
The frenetic pace of application development in modern IT organizations means it's not unusual to demand that an application go from minimal requirements gathering (literally, a napkin-based sketch) to a working first draft within an extremely short time frame, even a single weekend, with production deployment to follow just a few days later.
This article – the first in a series – demonstrates a real-life application development scenario: the creation of a mobile application that gives election canvassers a tool to identify, classify and inform voters in a huge suburban Chicago voting district – using the latest Oracle application development UI, data modeling tools, and database technology. Along the way, we’ll show how Oracle APEX makes short work of building a working application while the Oracle DBA leverages her newest tools—SQL Developer and Data Modeler—to build a secure, reliable, scalable application for her development team.
Ultimately, All Politics Are Local
My wife recently retired after almost 20 years in IT, a career spent building better applications through her roles as a QA technician, QA lead, and eventually senior project manager for a major multi-national insurance company. She finally had a chance to indulge her true passions—biology and citizen science—through her volunteer activities at the world-renowned Morton Arboretum near our home in Lisle, IL. But as we both adjusted to our new roles, I have to admit I found myself caught up in her frustration about the current political situation in our home state, most especially in our Congressional district.
So when my wife announced she was planning to volunteer for the campaign to elect a first-time politician, scientist, and businessman running for Congress, I enthusiastically supported her efforts. We decided to volunteer our combined IT expertise and organizational skills to the campaign’s Data Committee.
Turning Napkin Scribblings Into Application Requirements
We soon recognized that, from an IT perspective, the campaign needed some professional help. During our first few meetings with the campaign's leaders, I made some simple sketches—quite literally, on napkins!—of what they described (Figure 1). We identified several needs almost immediately, including what I considered the most crucial one: the ability to identify, track and canvass the voters most likely to vote for the campaign's candidate among the approximately 275,000 registered voters in the district.
After a good night’s sleep, I turned my attention to translating my napkin-based scribblings into some simple yet practical mental mapping of what would be needed from a data modeling perspective. First, it was pretty obvious that we had five major categories of data:
- Staffers: People who had volunteered to staff the campaign
- Campaign Organization: A hierarchical organization structure that established which staffer was fulfilling which role at any particular time during the campaign, as well as which staffers were responsible for managing other staffers in their roles in that hierarchy—for example, which staffer “wrangler” would be coordinating and tracking the activities of which people assigned to performing voter canvassing within the congressional district
- Voters: The pool of registered voters in the Congressional district
- Voting History: A record of how a registered voter had voted in recent election(s), and (if available) indicators as to how they might be expected to vote for our candidate in the upcoming mid-term election
- Canvassing: Finally, the true fruit of the campaign staff’s ultimate purpose—voters who had been identified as interested in voting for our candidate.
Most of the information about our campaign staffers was already available, captured and stored within Excel spreadsheets, JSON documents, or simple delimited files on a single workstation at campaign headquarters. And yes, my hair did turn just a bit greyer whenever I thought about the security aspects involved, should any of this personally identifiable information (PII) be accidentally leaked to our adversary's campaign staff. One major gap needed to be addressed almost immediately: we had to quickly gather, organize, cleanse and load data for potential registered voters and their voting history from public records.
And yes, believe it or not, the voting record of every registered voter is publicly available if the request for that data is routed through proper channels. Both major political parties are privy to these data; in the case of our candidate's Congressional campaign, an organization known as NGP VAN, based in Washington, D.C., provides a central clearinghouse for retrieval of data about individual registered voters and their voting history. To be clear, the only information available is whether or not a registered voter voted in past elections, not how (s)he voted; the sacred trust of the secret ballot is still preserved in the United States.
Application Requirements: Further Discovery
As we continued to ask questions during discussions with senior campaign staff, several other obvious needs for the application were uncovered, including the following requirements:
- Building a list of volunteers already working on the campaign, as well as a way to track and engage voters to become staff volunteers if they so desired
- Assigning a volunteer to one or more campaign roles like CANVASSER, PHONEBANKER, and so forth
- Assigning roles within a hierarchy of the campaign organization so all staffers would know their team’s tasks and accomplishments, as well as informing volunteers to whom they should report their progress, travails, and concerns
- Reporting and summarizing the progress of staffers as they fulfilled the role of canvasser when they contacted voters, thus enabling their leaders to decide where any gaps in voter coverage may have occurred, as well as which canvassers might need remedial training or reassignment if they were failing in that role
- Mapping the progress of canvassers versus geography covered, especially as the November 6th Election day drew nearer and nearer—if possible, in some sort of graphical format
Finally, senior staffers demanded that these capabilities be fully available from just about anywhere—whether using a desktop computer, laptop, tablet, or smartphone—including the ability to track progress via a summary dashboard showing daily, weekly and monthly targets for potential voters contacted versus those committing to vote for our candidate.
The good news, of course, is that we already had everything we needed to fulfill these needs: Oracle Database 18c, Oracle SQL Developer and Oracle Application Express (APEX). Moreover, I determined to leverage this challenge to put my money where my mouth was: I would make the shift from old-school Oracle DBA to a true DevOps DBA.
The DevOps DBA: Building a Logical Data Model In Minutes
I have to admit that even before I’d transferred the mayonnaise-stained napkin notes from our lunchtime get-together to electronic format, I had already envisioned the tables that would encompass the data for the application. I’d even already come up with a name for the application schema based on the nefarious politics for which my home town, Chicago, is notorious: VEVO (for Vote Early, Vote Often). I’ve summarized the tables and their expected contents below in Table 1.
Table 1. Summary of VEVO Database Tables
| Table Name | Description |
| --- | --- |
| T_VOTERS | Detailed information about the universe of registered voters within the U.S. Congressional District |
| T_VOTING_RESULTS | Details of each registered voter's voting history during past presidential elections, mid-term elections, and primary contests |
| T_STAFF | Detailed personal information about people who have volunteered to staff the election campaign |
| T_CAMPAIGN_ORG | A historical hierarchy of which staff volunteer has been assigned responsibility for managing other volunteer(s) on the campaign, including the role that each staffer has fulfilled at a particular point in time during the campaign |
| T_CANVASSING | A historical record of the canvassing activity that each volunteer has performed as they contacted voters, including any voting affinity |
I rushed ahead to create the DDL scripts to build new tablespaces for the application schema itself, the tables, their corresponding primary key constraints, and indexes. (You can see that code in Listing 1 at the end of this article.) Then I realized I didn’t have the most important artifact that I’d need to enable my DevOps-oriented application development team: an entity-relationship diagram (ERD) that describes the entities and their corresponding relationships.
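For readers who want a flavor of that DDL before digging into Listing 1, here is a minimal sketch of the kind of statements involved for one of the tables. The column names, sizes, and storage settings shown here are illustrative assumptions, not the actual Listing 1 definitions.

```sql
-- Illustrative sketch only; the actual DDL is in Listing 1.
-- Column names and sizes below are assumptions for demonstration.
CREATE TABLE t_voters (
    voter_id     NUMBER         NOT NULL,   -- assumed surrogate key
    first_name   VARCHAR2(50),
    last_name    VARCHAR2(50),
    street_addr  VARCHAR2(100),
    city         VARCHAR2(50),
    zip_code     VARCHAR2(10),
    precinct     VARCHAR2(30)
) TABLESPACE vevo_data;

-- Primary key constraint backed by an index in the dedicated index tablespace
ALTER TABLE t_voters
  ADD CONSTRAINT t_voters_pk PRIMARY KEY (voter_id)
  USING INDEX TABLESPACE vevo_idx;

-- A supporting index for a common canvassing lookup (illustrative)
CREATE INDEX t_voters_name_idx
  ON t_voters (last_name, first_name)
  TABLESPACE vevo_idx;
```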
Historical Note. The origin of the cynical phrase "Vote early — and vote often" has been attributed to various sources, including the notorious Chicago gangster Al Capone and various politicians in New York City's Tammany Hall. However, its original intent was to remind voters to vote early in the day to avoid long lines (and possible poll closings) as well as spurring them to vote in every election for which they were enfranchised. Recently, the phrase has been updated to reflect that meme: "Vote early, and vote often … just not in the same election."
Fortunately, I had the right tool at my fingertips: Oracle SQL Developer Data Modeler, the completely free (and yet quite powerful and flexible!) application available from Oracle. I downloaded the latest version from Oracle Technology Network (OTN) and started the application (Figure 2).
After selecting my previously-created DDL script as the source of information to import (Figure 3), I specified that Data Modeler should use Oracle Database 12.2 for its feature set specification (Figure 4). (Remember, Oracle 18c is essentially Oracle 12.2.0.2.)
Once the data modeling information was successfully validated and accepted into Data Modeler, I was able to create and save a logical data model diagram by right-clicking the diagram and selecting the Print Diagram … To Image File option to produce the .PNG file shown in Figure 5. Of course, there's a lot more I can do with Data Modeler, but we'll save those explorations for later articles in this series.
Flexibility
Since I was concerned about the security of any PII for both campaign staffers as well as voters, I decided that the Oracle Public Cloud (OPC) would be the best possible location to construct the underlying database infrastructure for the VEVO applications my team would be developing. I made that decision based on the end-to-end encryption capabilities of Oracle SQL*Net to protect data during both database ingress and egress; in addition, data at rest is automatically protected because OPC employs AES 256-bit encryption to encrypt all tablespaces of every Cloud-resident database.
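For the curious, native SQL*Net network encryption is driven by a handful of sqlnet.ora parameters. The fragment below is a representative example of what such a configuration might look like on the client side; it is not the campaign environment's actual settings.

```
# Representative sqlnet.ora fragment for native network encryption
# (illustrative values; not the actual campaign configuration)
SQLNET.ENCRYPTION_CLIENT = REQUIRED
SQLNET.ENCRYPTION_TYPES_CLIENT = (AES256)
SQLNET.CRYPTO_CHECKSUM_CLIENT = REQUIRED
SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT = (SHA256)
```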
I won’t go into detail about the steps I followed to create the new database in the OPC because there are numerous examples already available of how to build a new database using that technology; here’s one directly from the Oracle Learning Library.
Here are the essential steps I undertook:
- First, I created a new Oracle 18c database instance named ORA182, allocating two OCPUs and an additional 100 GB of storage. I also requested it be created as an Enterprise Edition – Extreme Performance environment so I could experiment with additional pertinent 18c features during application development.
- During its creation, I requested the inclusion of the sample schemas as well as the creation of a new PDB named PDBVEVO.
- I then added the two new tablespaces, VEVO_DATA and VEVO_IDX, into the PDBVEVO PDB.
- I created the new VEVO schema within the PDBVEVO PDB, making the VEVO_DATA tablespace the default tablespace for that pluggable database.
- Finally, I created the five tables I had previously identified within the new VEVO schema. (A SQL sketch of the last three steps appears below.)
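Steps 3 through 5 boil down to a handful of SQL statements. The sketch below captures their gist: the tablespace, PDB, and schema names match the article, but the sizes, password, and grants are assumptions, and the full table DDL lives in Listing 1.

```sql
-- Sketch of steps 3 through 5 (sizes, password, and grants are assumptions)
ALTER SESSION SET CONTAINER = pdbvevo;

CREATE TABLESPACE vevo_data DATAFILE SIZE 20G AUTOEXTEND ON NEXT 1G;
CREATE TABLESPACE vevo_idx  DATAFILE SIZE 5G  AUTOEXTEND ON NEXT 512M;

-- Make VEVO_DATA the default tablespace for the PDB
ALTER DATABASE DEFAULT TABLESPACE vevo_data;

-- Application schema owner
CREATE USER vevo IDENTIFIED BY "ChangeMe_2018#"
  QUOTA UNLIMITED ON vevo_data
  QUOTA UNLIMITED ON vevo_idx;

GRANT CREATE SESSION, CREATE TABLE, CREATE SEQUENCE,
      CREATE PROCEDURE, CREATE VIEW TO vevo;

-- The five VEVO tables are then created while connected as VEVO
-- (see Listing 1 for the complete DDL).
```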
Loading Sample Dimensions: Obfuscation Needed
My contacts within the campaign were able to provide me with sample data for the T_STAFF and T_CAMPAIGN_ORG tables. To ensure the complete anonymity of my fellow campaign staffers, however, I manually regenerated the data for those entities in CSV format and then used the inherent capabilities of SQL Developer to load it.
Figure 6 shows how I used SQL Developer to load the T_STAFF table. Note that the first row of the T_STAFF.CSV file included the column names for that table, so I checked off the Header option during the Data Import Wizard dialog.
Loading Sample Fact Tables: Random Data Generation Required
Unfortunately, my Data Committee contacts informed me that they were unable to supply me with sample voter, voting history, or canvassing data because of concerns about the privacy of voter and staff personal data – and I concurred with their concerns.
To facilitate quicker development efforts, I simply generated my own data for the T_VOTERS table via Dominic Giles's often-overlooked Data Generator tool, available via free download here. Listing 2 shows an abbreviated version of the XML configuration file I used to create 275,000 voter entries via Data Generator; the log file from generating those data is shown in Listing 3.
One of the things I really like about using Data Generator for these types of tasks is that it automatically generates—among other things!—the SQL*Loader control file for the table. I leveraged the generated control file shown in Listing 4 to invoke a SQL*Loader session and add the newly-generated data to the T_VOTERS table via a parallelized direct path load, as shown in Listing 5.
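The generated control file itself is in Listing 4, but for context, a bare-bones SQL*Loader control file for a comma-delimited T_VOTERS extract looks roughly like this (the column names are placeholders, not the generated file's contents):

```
-- Generic sketch of a SQL*Loader control file for T_VOTERS
-- (the actual generated control file is in Listing 4; column names are placeholders)
LOAD DATA
INFILE 't_voters.csv'
APPEND
INTO TABLE t_voters
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(voter_id, first_name, last_name, street_addr, city, zip_code, precinct)
```

Kicking off a direct path load is then a single command along the lines of `sqlldr vevo@pdbvevo control=t_voters.ctl direct=true parallel=true` (the connection details here are placeholders as well).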
To generate sufficiently random voter participation and canvassing data for the T_VOTING_RESULTS and T_CANVASSING tables, respectively, I wrote a relatively simple PL/SQL package, PKG_LOAD_GENERATOR. (See Listing 6 for the package specification and body.) When invoked, it generates a random number of voting history entries and canvassing entries for each of the 275,000 voters in T_VOTERS. With all data population finally complete, I verified the final row counts for all tables from within a SQL Developer SQL Worksheet.
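The full package is in Listing 6; as a rough idea of the approach, a stripped-down load generator might look like the following sketch. The column names referenced here are assumptions for illustration, not the real VEVO table definitions.

```sql
-- Minimal sketch of the load-generator idea (the real PKG_LOAD_GENERATOR is in Listing 6).
-- Table columns below are assumptions, not the actual VEVO definitions.
CREATE OR REPLACE PACKAGE pkg_load_generator AS
  PROCEDURE generate_history;
END pkg_load_generator;
/

CREATE OR REPLACE PACKAGE BODY pkg_load_generator AS
  PROCEDURE generate_history IS
  BEGIN
    FOR v IN (SELECT voter_id FROM t_voters) LOOP
      -- A random number (0-4) of prior-election participation rows per voter
      FOR i IN 1 .. TRUNC(DBMS_RANDOM.VALUE(0, 5)) LOOP
        INSERT INTO t_voting_results (voter_id, election_date, voted)
        VALUES (v.voter_id,
                ADD_MONTHS(DATE '2016-11-08', -24 * (i - 1)),
                CASE WHEN DBMS_RANDOM.VALUE < 0.7 THEN 'Y' ELSE 'N' END);
      END LOOP;
      -- Roughly one in three voters gets a canvassing contact row
      IF DBMS_RANDOM.VALUE < 0.33 THEN
        INSERT INTO t_canvassing (voter_id, contact_date, affinity)
        VALUES (v.voter_id,
                TRUNC(SYSDATE) - TRUNC(DBMS_RANDOM.VALUE(0, 90)),
                TRUNC(DBMS_RANDOM.VALUE(1, 6)));
      END IF;
    END LOOP;
    COMMIT;
  END generate_history;
END pkg_load_generator;
/
```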
Finally, using the last half of the schema generation script shown in Listing 1, I created all the unique and foreign key constraints for all of the VEVO schema’s tables and then gathered statistics just for the VEVO schema using procedure GATHER_SCHEMA_STATS of the DBMS_STATS utility package.
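Those finishing touches are again plain SQL plus a single DBMS_STATS call; a representative pair of statements (with illustrative constraint and column names, not the actual Listing 1 contents) looks like this:

```sql
-- Representative finishing statements (constraint and column names are illustrative)
ALTER TABLE t_voting_results
  ADD CONSTRAINT t_voting_results_voter_fk
  FOREIGN KEY (voter_id) REFERENCES t_voters (voter_id);

ALTER TABLE t_canvassing
  ADD CONSTRAINT t_canvassing_voter_fk
  FOREIGN KEY (voter_id) REFERENCES t_voters (voter_id);

-- Gather optimizer statistics for just the VEVO schema
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'VEVO', cascade => TRUE);
END;
/
```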
Next Time: Building a Production-Ready, Mobile-Enabled Application with APEX
We’ve accomplished quite a bit already! To recap our progress so far:
- We’ve got our application’s initial data model defined, including an ERD for its logical model.
- Our development database has been created in the Oracle Public Cloud as an Oracle 18c PDB.
- We’ve loaded sufficient sample data into our database using several different methods so our application development team can start experimenting against realistic data as they build out our application’s initial version.
The next article in this ongoing series will show just how easy it is to leverage Oracle APEX to build a sufficiently-robust application for election canvassers to identify, inform, and classify potential registered voters. Stay tuned – it’s going to be an interesting ride!
About the Author
Jim Czuprynski has nearly four decades of professional experience in information technology. He served in diverse roles at several Fortune 1000 companies—mainframe programmer, applications developer, business analyst, and project manager—before becoming an Oracle DBA in 2001. Jim was awarded the status of Oracle ACE Director in March 2014 and is a sought-after public speaker on Oracle Database technology features, presenting often at Oracle OpenWorld, IOUG COLLABORATE, ODTUG Kaleidoscope, Hotsos Symposium, Oracle Development Community tours, and Oracle User Group conferences around the world.
Jim has authored more than 100 articles on facets of Oracle Database administration since 2003 for databasejournal.com and IOUG SELECT. He has also co-authored four books on Oracle database technology.
Jim’s blog, Generally … It Depends, contains his regular observations on all things Oracle.
Editor’s note: For full listings mentioned in this article, please see the links below. Questions or interested in more? Please reach out to the IOUG editor, Kristin Fields, at select@ioug.org.
Listing 1: DDL for VEVO Schema Creation
Listing 2: Parameter file (abbreviated) for generating 275,000 rows via Data Generator utility for eventual population of the T_VOTERS table
Listing 3: Output from Data Generator T_VOTERS example data generation
Listing 4: T_VOTERS SQL*Loader Control File
Listing 5: T_VOTERS SQL*Loader Log File
Listing 6: PKG_LOAD_GENERATOR Contents