
A typical ODA project (and why I love Oracle Database Appliance)


Introduction

You can say a lot about Oracle infrastructure options, but nothing compares to the experience gained from real projects. I have done quite a lot of ODA projects over the past years (and projects on other platforms too), and I would like to tell you why I trust this solution. For that, I will tell you the story of one of my latest ODA projects.

Before choosing ODAs

A serious audit of the current infrastructure is needed; don't skip that step. Sizing the storage, the memory and the licenses takes several days, but you will need that study: you cannot make good decisions blindly.

The metrics I would recommend collecting for designing your new ODA infrastructure are:
– the needed segregation (PROD/UAT/DEV/TEST/DR…)
– the DB time of the current databases, consolidated for each group (see the query sketch just after this list)
– the size and growth forecast for each database
– the memory usage for SGA and PGA, and the memory advised by Oracle inside each database
– an overview of future projects coming in the next years
– an overview of legacy applications that will stop using Oracle in the next years
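
For the DB time, a quick look at AWR gives the trend per snapshot; a rough query sketch is shown below (it ignores instance restarts and requires a Diagnostics Pack license):

select s.begin_interval_time,
       round((m.value - lag(m.value) over (order by m.snap_id))/1e6/60, 1) as db_time_min
from   dba_hist_sys_time_model m
join   dba_hist_snapshot s
  on   s.snap_id = m.snap_id
 and   s.dbid = m.dbid
 and   s.instance_number = m.instance_number
where  m.stat_name = 'DB time'
order  by s.begin_interval_time;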

Once done, this audit will help you choose how many ODAs, which type of ODA, how many disks in each ODA and so on. For this particular project, the audit had been done several months earlier and led to an infrastructure composed of 6 X8-2M ODAs with various disk configurations.

Before delivery

I usually provide an Excel file to my customer for collecting essential information for the new environment:
– hostnames of the servers
– purpose of each server
– public network configuration (IP, netmask, gateway, DNS, NTP, domain)
– ILOM network configuration (IP, netmask, gateway – this is for the management interface)
– additional networks configuration (for backup or administration networks if additional interfaces have been ordered)
– the DATA/RECO split of the disks
– the target uid and gid for Linux users on ODA
– the version(s) of the database needed
– the number of cores to enable on each ODA (if using Enterprise Edition – this is how licenses are configured)

With this file, I'm pretty sure to have almost everything I need before the servers are delivered.

Delivery day

Once the ODAs are delivered, you'll first need to rack them in your datacenters and plug in the power and network cables. If you are well organized, the network patching will have been done beforehand and the cables will be ready to plug in. Racking an ODA S or M is easy: less than one hour, even less if you're used to racking servers. For an ODA HA it's a little more complicated, because 1 ODA HA is actually 2 servers and 1 DAS storage enclosure, or 2 DAS enclosures if you ordered the maximum disk configuration. But these are normal servers, and it shouldn't take too long or be too complicated.

1st Day

The first day is an important day because you can do a lot if everything is prepared.

To deploy an ODA, you'll need to download several zipfiles from MOS, and these zipfiles are quite big, 10 to 40GB depending on your configuration, so don't wait to download them from the links in the documentation. You can choose an older software version than the current one, but it's usually a good idea to deploy the very latest version. You need to download:
– the ISO file for bare metal reimaging
– the GI clone
– the DB clones for the database versions you plan to use
– the patch files, because you will probably need them (we will see why later)

While the download is running, you can connect to the ILOM IP of each server and configure a static IP address. Finding out the initial ILOM IP will require help from the network/system team, because each ILOM first requests a dynamic IP from DHCP, which you then need to change.

Once everything is downloaded, you can start the reimaging of the first ODA from the ILOM of the server, then a few minutes later on the second one if you've got multiple ODAs. After the reimaging, you will use the configure-firstnet script on each ODA to configure the basic network settings on the public interface, so that you can connect to the server itself.

Once the first ODA is ready, I usually prepare the json file for the appliance deployment from the template and according to the settings provided in the Excel file. It takes me about 1 hour to make sure nothing is wrong or missing, and then I start the appliance creation. I always create the appliance with a DBTEST database to make sure everything is fine up to database creation. During the first appliance creation, I copy the json file to the other ODA, change the few parameters that differ from the previous one, and start the appliance creation there as well.
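
The appliance creation itself is a single odacli call; for example (the file name is just an illustration, and the job id comes from the command's output):

# as root, on each ODA
odacli create-appliance -r /opt/oracle/dcs/conf/oda01.json
# follow the creation job until it completes
odacli describe-job -i <job_id>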

Once done, I deploy the additional dbhomes if needed on both ODAs in parallel.
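
Each additional dbhome is a single command (use the version matching the DB clone you downloaded):

odacli create-dbhome -v <db_version>
odacli list-dbhomes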

Then I check the versions of the components with odacli describe-component. Be aware that a reimage does not update the firmware of the ODAs. If firmware/BIOS/ILOM are not up to date, you need to apply the patch on top of your deployment, even if the software side is OK. So copy the patch to both ODAs and apply it; it will probably need a reboot or two.
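
The patching sequence then looks roughly like this (file names and versions are placeholders; the exact steps depend on the target release, so check the matching ODA documentation):

# register the patch bundle previously copied to the ODA
odacli update-repository -f /tmp/oda-sm-<version>-server.zip
# update the DCS agent, then the server (firmware, BIOS, ILOM, OS, GI)
odacli update-dcsagent -v <version>
odacli update-server -v <version>
# then the database homes, if needed
odacli update-dbhome -i <dbhome_id> -v <version>
# and check the result
odacli describe-component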

Once done, I usually configure the number of cores with odacli update-cpucore to match the license. If my customer only has Standard Edition licenses, I also decrease the number of cores to make sure to benefit from the maximum CPU speed. See why here.
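
For example, to enable 8 cores (pick the count matching your licenses):

odacli update-cpucore -c 8
odacli describe-cpucore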

Finally, I do a sanity check of the 2 ODAs, checking if everything is running fine.

At the end of this first day, 2 ODAs are ready to use.

2nd Day

I'm quite keen on having all my ODAs exactly identical. So the next day, I deployed the other ODAs. For this particular project, it was a total of 6 ODAs with various disk and license configurations, but the deployment method is the same.

Based on what I had done the previous day, I quite easily deployed the 4 other ODAs, each with its own json deployment file. Then I applied the patch, configured the license, and did the sanity checks.

At the end of this second day, my 6 ODAs were ready to use.

3rd day

2 ODAs came up with hardware warnings, and most of the time these warnings are false positives. So I did a check from the ILOM CLI, reset the alerts and restarted the ILOM. That solved the problem.

As documentation is quite important for me and for my customer, I spent the rest of the day consolidating all the information and providing a first version of the documentation to the customer.

4th Day

The fourth day was actually the next week. In the meantime, my customer created a first development database and put it into “production” without any problem.

With every new ODA software release, new features are available. And I think it's worth testing these features because they can bring you something very useful.

This time, I was quite curious about the Data Guard feature now included with odacli. Manual Data Guard configuration always takes quite a while: it's not very difficult, but a lot of steps are needed to achieve this kind of configuration. And you have to do it for each database, meaning that it can take days to put all your databases under Data Guard protection.

So I proposed to my customer to test the Data Guard implementation with odacli. I created a test database dedicated to that purpose and followed the online documentation. It took me the whole day, but I was able to create the standby, configure Data Guard, do a switchover and a switchback, and write a clean and complete procedure for the customer. You need to do that because the documentation has 1 or 2 steps that need more accuracy, and 1 or 2 others that need to be adapted to the customer's environment.
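
The odacli part of the procedure boils down to very few commands; a rough outline (the instantiation of the standby database on the second ODA is not shown, and the id is a placeholder):

# on the primary ODA, once the standby database has been restored on the other node
odacli configure-dataguard
# check the configuration, then switch over and back
odacli list-dataguardstatus
odacli switchover-dataguard -i <dataguard_config_id>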

This new feature will definitely simplify the Data Guard configuration, please take a look at my blog post if you’d like to test it.

5th Day

The goal of this day was to write a procedure for configuring Data Guard between the old platform and the new one. An ACFS to ASM conversion was needed as well. So we worked on that point, ran a lot of tests and finally provided a procedure for most cases. A DUPLICATE DATABASE FOR STANDBY with a BACKUP LOCATION was used in that procedure.
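
Stripped down to its core, it is a standard backup-based duplication; a simplified sketch (connection string and path are placeholders, and the auxiliary instance must already be started in NOMOUNT):

$ rman auxiliary sys/<password>@<standby_connect_string>

RMAN> duplicate database for standby
2>    backup location '/u99/backup/MYDB'
3>    nofilenamecheck;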

This procedure is not ODA specific; most of the advanced operations on ODA are done using classic tools.

6th, 7th and 8th days

These days were dedicated to the ODA workshop. It's the perfect timing for this training, because the customer has already picked up quite a lot of information during the deployment, and the servers are not yet in production, meaning that we can use them for demos and exercises. At dbi services, we write our own training material. The ODA workshop goes from the history of the ODA up to the lifecycle management of the platform. You need 2 to 3 days to get a good overview of the solution and be ready to work on it.

9th and 10th days

These days were actually extra days, but in my opinion extra days are not useless days. Most often, they are a good addition because there's always a need to go deeper into some topics.

This time, we needed to refine the Data Guard procedure and test it with older versions: 12.2, 12.1 and 11.2. We discovered that it didn't work for 11.2, and we tried to debug it. Finally, we decided to use the ODA Data Guard feature only for 12.1 and later versions, which was OK because only a few databases cannot go higher than 11.2. We also found that configuring Data Guard from scratch only takes 40 minutes, including the backup/restore operations (for the smallest possible database), which definitely validated the efficiency of this method over manual configuration.

We also studied the best way to create additional listeners, because odacli does not include listener management. Using srvctl for that was quite convenient and clean, so we provided a procedure to configure these listeners, and we also tested the Data Guard feature with these dedicated listeners; it worked fine.
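
For the record, adding such a listener with srvctl only takes a few commands (name and port are examples):

srvctl add listener -listener LISTENER2 -endpoints "TCP:1522"
srvctl start listener -listener LISTENER2
srvctl status listener -listener LISTENER2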

The last task was to provide a procedure for ACFS to ASM migration. Once migrated to 19c, the 11gR2 databases can be moved to ASM to get rid of ACFS (ASM has been chosen by the customer for all 12c and later databases). odacli does not provide a mechanism to move from ACFS to ASM, but it's possible to restore an ACFS database to ASM quite easily.
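
One possible way to do it with plain RMAN, very much simplified (redo logs, temp files, the spfile and the control files have to be relocated as well, and this assumes the database already runs from a 19c home):

RMAN> backup as copy database format '+DATA';
RMAN> shutdown immediate;
RMAN> startup mount;
RMAN> switch database to copy;
RMAN> recover database;
RMAN> alter database open;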

Actually, these 2 days were very productive. And I also had enough time to send v1.1 of the documentation with all the procedures my customer and I had worked out together.

The next days

For this project, my job was limited to deploying and configuring the ODAs, training my customer and sharing best practices. With a team of experienced DBAs, it will now be quite easy to continue without me, the next task being to migrate each database to this new infrastructure. Even on ODA, the migration process takes days because there are a lot of databases coming from different versions.

Conclusion

ODA is a great platform for optimizing your work, starting with the migration project. You don't lose time because this solution is ready to use, and I don't think it's possible to be as efficient with another on-premises platform. You will save a lot of time, you will have fewer problems, you will manage everything by yourself, and the performance will be good. ODA is probably the best option for carrying out this kind of project with minimal risk and maximum efficiency.



JENKINS install Jenkins on Windows


Hi Team,
Today let's talk about Jenkins.

What is Jenkins?

Jenkins is an open source automation server that enables developers around the world to reliably build, test, and deploy their software. It is the leading open-source continuous integration server. Built with Java, it can manage a huge number of plugins which extend its capabilities.
We will see how to install it on a Windows machine step by step.

Prerequisites

All the information relevant to your configuration and OS can be found on the excellent Jenkins documentation site, for example the installation page for Windows:

https://www.jenkins.io/doc/book/installing/windows/

Install via MSI installer

Go to the download link and launch the MSI installation:
https://ftp.halifax.rwth-aachen.de/jenkins/windows/2.266/jenkins.msi
Launch Jenkins MSI


Click on next on the setup wizard

Select default folder to install Jenkins

Choose whether to install the service with the local system account or to log on with a service account

You have to test your credentials to continue if you choose a local or domain user

Then, select the port for Jenkins and test it to validate. Click on next step to continue

Select the Java path that will be used (if you don’t have Java installed you will not be able to continue)

You can then select other features to install:

Once you have configured Jenkins as needed, click on Install to start the installation

Click on Finish when it is done; a web page will then pop up

Wizard installation

The web page will ask you to wait while Jenkins is being prepared

Unlock Jenkins

Get the password from the installation log, following the paths indicated below (note: you may need extra privileges to access them)

In C:\Program Files\Jenkins\jenkins.err.log

In JENKINS_HOME\secrets\initialAdminPassword
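
If you prefer the command line, the same password can be displayed from an (elevated) command prompt; this assumes JENKINS_HOME is the default installation folder shown above, adjust the path otherwise:

type "C:\Program Files\Jenkins\secrets\initialAdminPassword"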

Choose plugins to install

You can now select whether you want to install the suggested bundle of plugins or to pick the plugins you want yourself

Let Jenkins load the plugins

Admin user creation

Create the first admin user by filling out the form on this screen, then click “Save and Finish” (note: you can also directly continue as admin). You can create additional users after Jenkins is installed and running.

Select the Jenkins URL for login

After saving, you will see a screen stating that Jenkins is ready; the dashboard can now be opened.
As we skipped the first admin creation, we must connect with the admin login and the password used for the Jenkins installation.
You can now access the Jenkins dashboard, and we can start to play!! (note: you can create other admin users as I did on the screen)

Conclusion

You now have all the steps to install Jenkins on a Windows machine. If you need a tutorial to install it on Linux, feel free to ask me. Next time we will look at the differences between Jenkins and Jenkins X, so stay tuned and don't hesitate to follow the dbi bloggers 😉


Oracle 21c : Create a New Database


Oracle 21c is now released in the cloud, and in this blog I am just testing my first database creation. As in earlier releases, dbca is still present; just launch it:

[oracle@oraadserver admin]$ dbca
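
The same creation can also be scripted with dbca in silent mode; a minimal sketch, where names, passwords and paths are placeholders rather than the values used for this test:

dbca -silent -createDatabase \
     -templateName General_Purpose.dbc \
     -gdbName DB21C -sid DB21C \
     -createAsContainerDatabase true -numberOfPDBs 1 \
     -pdbName PDB1 -pdbAdminPassword <pdb_admin_password> \
     -sysPassword <sys_password> -systemPassword <system_password> \
     -storageType FS -datafileDestination /u01/app/oracle/oradata \
     -totalMemory 2048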

After the creation, a few queries to verify:

SQL> select comp_name,version,status from dba_registry;

COMP_NAME                                VERSION    STATUS
---------------------------------------- ---------- ----------
Oracle Database Catalog Views            21.0.0.0.0 VALID
Oracle Database Packages and Types       21.0.0.0.0 VALID
Oracle Real Application Clusters         21.0.0.0.0 OPTION OFF
JServer JAVA Virtual Machine             21.0.0.0.0 VALID
Oracle XDK                               21.0.0.0.0 VALID
Oracle Database Java Packages            21.0.0.0.0 VALID
OLAP Analytic Workspace                  21.0.0.0.0 VALID
Oracle XML Database                      21.0.0.0.0 VALID
Oracle Workspace Manager                 21.0.0.0.0 VALID
Oracle Text                              21.0.0.0.0 VALID
Oracle Multimedia                        21.0.0.0.0 VALID
Oracle OLAP API                          21.0.0.0.0 VALID
Spatial                                  21.0.0.0.0 VALID
Oracle Locator                           21.0.0.0.0 VALID
Oracle Label Security                    21.0.0.0.0 VALID
Oracle Database Vault                    21.0.0.0.0 VALID

16 rows selected.

In upcoming blogs we will see some new features. For now, just note that since Oracle 20c it is no longer possible to create a non-CDB instance.


A Simple Repository Browser Utility


A few weeks ago, as the final step of a cloning procedure, I wanted to check if the cloned repository was OK. One of the tests was to peek and poke around in the repository and try to access its content. This is typically the kind of task for which you'd use a GUI-based program, because it is much quicker and easier that way than sending manually typed commands to the server from within idql and iapi and transferring the contents to a desktop where a pdf reader, word processor and spreadsheet programs can be used to visualize them. Documentum Administrator (alias DA) is the tool we generally use for this purpose. It is a browser-based java application deployed on a web application server such as Oracle WebLogic (which is overkill just for DA) or tomcat. It also requires IE as the browser because DA needs to download an executable extension for Windows in order to enable certain functionalities. So, I had to download and install the full requirements' stack to enable DA: an openjdk (several trials before the correct one, an OpenJDK v11, was found), tomcat, DA (twice, as one download was apparently crippled), then configure and deploy DA (with lots of confusing date errors which could have been related to the cloning process but, after all, were not), start my Windows VM (all 8 Gb of RAM of it), start IE (which I never use, and you shouldn't either), point IE to the AWS instance DA was installed in, download and install the extension when prompted to do so, all this only to notice that 1. content visualization still did not work and 2. the installation did not stick, as it kept asking to download and install the extension over and over again. All this DA part took twice as long as the cloning process itself. All I wanted was to browse the repository and click on a few random files here and there to see if their content was reachable, and to do that I had to install several Gb of, dare I say?, bloatware. “This is ridiculous”, I thought, there has to be a better way. And indeed there is.
I remembered a cute little python module I use sometimes, server.py. It embeds a web server and presents a navigable web interface to the file system directory it is started from. From there, one can click on a file link and the file is opened in the browser, or by the right application if it is installed and the mime file association is correct; or click on a sub-directory link to enter it. Colleagues can also use the URL to come and fetch files from my machines if needed, a quick way to share files, albeit a temporary one.
 
Starting the file server in the current directory:

 
Current directory’s listing:

As it is open source, its code is available here server.py.
The file operations per se, mainly calls to the os module, were very few and thin, so I decided to give it a try, replacing them with calls to the repository through the module DctmAPI.py (see blog here DctmAPI.py). The result, after resolving a few issues due to the way Documentum repositories implement the file system metaphor, was quite effective and is presented in this blog. Enjoy.

Installing the module

As the saying goes, the shoemaker's son always goes barefoot, so no GitHub here: you'll have to download the module's original code from the aforementioned site, rename it to original-server.py and patch it. The changes have been kept minimal so that the resulting patch file stays small and manageable.
On my Linux box, the downloaded source had extraneous empty lines, which I removed with the following one-liner:

$ gawk -v RS='\n\n' '{print}' original-server.py > tmp.py; mv tmp.py original-server.py

After that, save the following patch instructions into the file delta.patch:

623a624,625
> import DctmBrowser
> 
637a640,644
>     session = None 
> 
>     import re
>     split = re.compile('(.+?)\(([0-9a-f]{16})\)')
>     last = re.compile('(.+?)\(([0-9a-f]{16})\).?$')
666,667c673,674
<         f = None
         # now path is a tuple (current path, r_object_id)
>         if DctmBrowser.isdir(SimpleHTTPRequestHandler.session, path):
678,685c685,686
<             for index in "index.html", "index.htm":
<                 index = os.path.join(path, index)
<                 if os.path.exists(index):
<                     path = index
<                     break
<             else:
<                 return self.list_directory(path)
             return self.list_directory(path)
>         f = None
687c688
             f = DctmBrowser.docopen(SimpleHTTPRequestHandler.session, path[1], 'rb')
693c694
             self.send_header("Content-type", DctmBrowser.splitext(SimpleHTTPRequestHandler.session, path[1]))
709a711
>         path is a (r_folder_path, r_object_id) tuple;
712c714
             list = DctmBrowser.listdir(SimpleHTTPRequestHandler.session, path)
718c720
         list.sort(key=lambda a: a[0].lower())
721,722c723,726
<             displaypath = urllib.parse.unquote(self.path,
             if ("/" != self.path):
>                displaypath = "".join(i[0] for i in SimpleHTTPRequestHandler.split.findall(urllib.parse.unquote(self.path, errors='surrogatepass')))
>             else:
>                displaypath = "/"
724c728
             displaypath = urllib.parse.unquote(path[0])
727c731
         title = 'Repository listing for %s' % displaypath
734c738
<         r.append('\n<h1>%s</h1>' % title)
---
>         r.append('<h3>%s</h3>\n' % title)
736,737c740,745
<         for name in list:
         # add an .. for the parent folder;
>         if ("/" != path[0]):
>             linkname = "".join(i[0] + "(" + i[1] + ")" for i in SimpleHTTPRequestHandler.split.findall(urllib.parse.unquote(self.path, errors='surrogatepass'))[:-1]) or "/"
>             r.append('%s' % (urllib.parse.quote(linkname, errors='surrogatepass'), html.escape("..")))
>         for (name, r_object_id) in list:
>             fullname = os.path.join(path[0], name)
740c748
             if DctmBrowser.isdir(SimpleHTTPRequestHandler.session, (name, r_object_id)):
742,749c750,751
<                 linkname = name + "/"
<             if os.path.islink(fullname):
<                 displayname = name + "@"
<                 # Note: a link to a directory displays with @ and links with /
<             r.append('
  • %s
  • ' < % (urllib.parse.quote(linkname, < errors='surrogatepass'), linkname = name + "(" + r_object_id + ")" + "/" > r.append('
  • %s
  • ' % (urllib.parse.quote(linkname, errors='surrogatepass'), html.escape(displayname))) 762,767c764 < """Translate a /-separated PATH to the local filename syntax. < < Components that mean special things to the local file system < (e.g. drive or directory names) are ignored. (XXX They should < probably be diagnosed.) """Extracts the path and r_object_id parts of a path formatted thusly: /....(r_object_id){/....(r_object_id)} 768a766,768 > if "/" == path: > return (path, None) > 773d772 < trailing_slash = path.rstrip().endswith('/') 781c780 path = "/" 787,789c786,787 < if trailing_slash: < path += '/' (path, r_object_id) = SimpleHTTPRequestHandler.last.findall(path)[0] > return (path, r_object_id) 807,840d804 < def guess_type(self, path): < """Guess the type of a file. < < Argument is a PATH (a filename). < < Return value is a string of the form type/subtype, < usable for a MIME Content-type header. < < The default implementation looks the file's extension < up in the table self.extensions_map, using application/octet-stream < as a default; however it would be permissible (if < slow) to look inside the data to make a better guess. < < """ < < base, ext = posixpath.splitext(path) < if ext in self.extensions_map: < return self.extensions_map[ext] < ext = ext.lower() < if ext in self.extensions_map: < return self.extensions_map[ext] < else: < return self.extensions_map[''] < < if not mimetypes.inited: < mimetypes.init() # try to read system mime.types < extensions_map = mimetypes.types_map.copy() < extensions_map.update({ < '': 'application/octet-stream', # Default < '.py': 'text/plain', < '.c': 'text/plain', < '.h': 'text/plain', < }) 1175c1140 ServerClass=HTTPServer, protocol="HTTP/1.0", port=8000, bind="", session = None): 1183a1149 > HandlerClass.session = session 1212d1177 <

    Apply the patch using the following command:

    $ patch -n original-server.py delta.patch -o server.py
    

    server.py is the patched module with the repository access operations replacing the file system access ones.
    As the command-line needs some more parameters for the connectivity to the repository, an updated main block has been added to parse them and moved into the new executable browser_repo.py. Here it is:

    import argparse
    import server
    import textwrap
    import DctmAPI
    import DctmBrowser
    
    if __name__ == '__main__':
        parser = argparse.ArgumentParser(
           formatter_class=argparse.RawDescriptionHelpFormatter,
           description = textwrap.dedent("""\
    A web page to navigate a docbase's cabinets & folders.
    Based on Łukasz Langa's python server.py module https://hg.python.org/cpython/file/3.5/Lib/http/server.py
    cec at dbi-services.com, December 2020, integration with Documentum repositories;
    """))
        parser.add_argument('--bind', '-b', default='', metavar='ADDRESS',
                            help='Specify alternate bind address [default: all interfaces]')
        parser.add_argument('--port', action='store',
                            default=8000, type=int,
                            nargs='?',
                            help='Specify alternate port [default: 8000]')
        parser.add_argument('-d', '--docbase', action='store',
                            default='dmtest73', type=str,
                            nargs='?',
                            help='repository name [default: dmtest73]')
        parser.add_argument('-u', '--user_name', action='store',
                            default='dmadmin',
                            nargs='?',
                            help='user name [default: dmadmin]')
        parser.add_argument('-p', '--password', action='store',
                            default='dmadmin',
                            nargs='?',
                            help=' user password [default: "dmadmin"]')
        args = parser.parse_args()
    
        # Documentum initialization and connecting here;
        DctmAPI.logLevel = 1
    
        # not really needed as it is done in the module itself;
        status = DctmAPI.dmInit()
        if status:
           print("dmInit() was successful")
        else:
           print("dmInit() was not successful, exiting ...")
           sys.exit(1)
    
        session = DctmAPI.connect(args.docbase, args.user_name, args.password)
        if session is None:
           print("no session opened in docbase %s as user %s, exiting ..." % (args.docbase, args.user_name))
           exit(1)
    
        try:
           server.test(HandlerClass=server.SimpleHTTPRequestHandler, port=args.port, bind=args.bind, session = session)
        finally:
           print("disconnecting from repository")
           DctmAPI.disconnect(session)
    

    Save it into file browser_repo.py. This is the new main program.
    Finally, helper functions have been added to interface the main program to the module DctmAPI:

    #
    # new help functions for browser_repo.py;
    #
    
    import DctmAPI
    
    def isdir(session, path):
       """
       return True if path is a folder, False otherwise;
       path is a tuple (r_folder_path, r_object_id);
       """
       if "/" == path[0]:
          return True
       else:
          id = DctmAPI.dmAPIGet("retrieve, " + session + ",dm_folder where r_object_id = '" + path[1] + "'")
       return id
    
    def listdir(session, path):
       """
       return a tuple of objects, folders or documents with their r_object_id, in folder path[0];
       path is a tuple (r_folder_path, r_object_id);
       """
       result = []
       if path[0] in ("/", ""):
          DctmAPI.select2dict(session, "select object_name, r_object_id from dm_cabinet", result)
       else:
          DctmAPI.select2dict(session, "select object_name, r_object_id from dm_document where folder(ID('" + path[1] + "')) UNION select object_name, r_object_id from dm_folder where folder(ID('" + path[1] + "'))", result)
       return [[doc["object_name"], doc["r_object_id"]] for doc in result]
    
    def docopen(session, r_object_id, mode):
       """
       returns a file handle on the document with id r_object_id downloaded from its repository to the temporary location and opened;
       """
       temp_storage = '/tmp/'
       if DctmAPI.dmAPIGet("getfile," + session + "," + r_object_id + "," + temp_storage + r_object_id):
          return open(temp_storage + r_object_id, mode)
       else:
          raise OSError
    
    def splitext(session, r_object_id):
       """
       returns the mime type as defined in dm_format for the document with id r_object_id;
       """
       result = []
       DctmAPI.select2dict(session, "select mime_type from dm_format where r_object_id in (select format from dmr_content c, dm_document d where any c.parent_id = d.r_object_id and d.r_object_id = '" + r_object_id + "')", result)
       return result[0]["mime_type"] if result else ""
    

    Save this code into the file DctmBrowser.py.
    To summarize, we have:
    1. the original module original-server.py to be downloaded from the web
    2. delta.patch, the diff file used to patch original-server.py into file server.py
    3. DctmAPI.py, the python interface to Documentum, to be fetched from the provided link to a past blog
    4. helper functions in module DctmBrowser.py
    5. and finally the main executable browser_repo.py
    Admittedly, a git repository would be nice here, maybe one day …
    Use the command below to get the program’s help screen:

    $ python browser_repo.py --help                        
    usage: browser_repo.py [-h] [--bind ADDRESS] [--port [PORT]] [-d [DOCBASE]]
                          [-u [USER_NAME]] [-p [PASSWORD]]
    
    A web page to navigate a docbase's cabinets & folders.
    Based on Łukasz Langa's python server.py module https://hg.python.org/cpython/file/3.5/Lib/http/server.py
    cec at dbi-services.com, December 2020, integration with Documentum repositories;
    
    optional arguments:
      -h, --help            show this help message and exit
      --bind ADDRESS, -b ADDRESS
                            Specify alternate bind address [default: all
                            interfaces]
      --port [PORT]         Specify alternate port [default: 8000]
      -d [DOCBASE], --docbase [DOCBASE]
                            repository name [default: dmtest73]
      -u [USER_NAME], --user_name [USER_NAME]
                            user name [default: dmadmin]
      -p [PASSWORD], --password [PASSWORD]
                            user password [default: "dmadmin"]
    

    Thus, the command below will launch the server on port 9000 with a session opened in repository dmtest73 as user dmadmin with password dmadmin:

    $ python browser_repo.py --port 9000 -d dmtest73 -u dmadmin -p dmadmin 
    

    If you prefer long name options, use the alternative below:

    $ python browser_repo.py --port 9000 --docbase dmtest73 --user_name dmadmin --password dmadmin 
    

    Start your favorite browser, any browser, just as God intended it in the first place, and point it to the host where you started the program with the specified port, e.g. http://192.168.56.10:9000/:

    You are gratified with a very spartan, yet effective, view on the repository’s cabinets. Congratulations, you did it !

    Moving around in the repository

    As there is no root directory in a repository, the empty path or “/” are interpreted as a request to display a list of all the cabinets; each cabinet is a directory’s tree root. The program displays dm_folders and dm_cabinets (which are sub-types of dm_folder after all), and dm_document. Folders have a trailing slash to identify them, whereas documents have none. There are many other objects in repositories’ folders and I chose not to display them because I did not need to but this can be changed on lines 25 and 27 in the helper module DctmBrowser.py by specifying a different doctype, e.g. the super-type dm_sysobject instead.
    An addition to the original server module is the .. link to the parent folder; I think it is easier to use it than the browser's back button or right click/back arrow, but those are still usable since the program is stateless. Actually, a starting page could even be specified manually in the starting URL if it weren't for its unusual format. In effect, the folder components and the documents' full paths in URLs and html links are suffixed with a parenthesized r_object_id, e.g.:

    http://192.168.56.10:9000/System(0c00c35080000106)/Sysadmin(0b00c3508000034e)/Reports(0b00c35080000350)/
    -- or, url-encoded:
    http://192.168.56.10:9000/System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/Reports%280b00c35080000350%29/
    

    This looks ugly but it allows solving 2 issues specific to repositories:
    1. Document names are not unique within the same folder, but are on a par with any other document attribute. Consequently, a folder can quietly contain hundreds of identically named documents without any name conflict. In effect, what tells two documents apart is their unique r_object_id attribute, and that is the reason why it is appended to the links and URLs. This is not a big deal because this potentially annoying technical information is not displayed in the web page; it is only visible while hovering over links and in the browser's address bar.
    2. Document names can contain any character, even “/” and “:”. So, given a document's full path name, how to parse it and separate the parent folder from the document's name so it can be reached? There is no generic, unambiguous way to do that. With the appended document's unique r_object_id, it is a simple matter to extract the id from the full path and Bob's your uncle (RIP Jerry P.).
    Both specificities above make it impossible to access a document through its full path name alone, therefore the documents' ids must be carried around; for folders it is not necessary, but it has been done in order to have a uniform format. As a side-effect, database performance is also possibly better.
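
    A quick interactive python session shows how simple that extraction is with the two regular expressions used in the patched module (the path is the example shown above, and the output lines are what the patterns return):

    $ python3
    >>> import re
    >>> # same patterns as in the patched server.py
    >>> split = re.compile(r'(.+?)\(([0-9a-f]{16})\)')
    >>> last = re.compile(r'(.+?)\(([0-9a-f]{16})\).?$')
    >>> p = "/System(0c00c35080000106)/Sysadmin(0b00c3508000034e)/Reports(0b00c35080000350)/"
    >>> split.findall(p)
    [('/System', '0c00c35080000106'), ('/Sysadmin', '0b00c3508000034e'), ('/Reports', '0b00c35080000350')]
    >>> last.findall(p)[0]
    ('/System(0c00c35080000106)/Sysadmin(0b00c3508000034e)/Reports', '0b00c35080000350')

    The first pattern yields every (sub-path, id) couple, which is how the displayable path and the parent link are rebuilt, while the second one isolates the id of the last, i.e. clicked, component.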
    If the program is started with no stdout redirection, log messages are visible on the screen, e.g.:

    dmadmin@dmclient:~/dctm-webserver$ python browser_repo.py --port 9000 --docbase dmtest73 --user_name dmadmin --password dmadmin 
    dmInit() was successful
    Serving HTTP on 0.0.0.0 port 9000 ...
    192.168.56.1 - - [05/Dec/2020 22:57:00] "GET / HTTP/1.1" 200 -
    192.168.56.1 - - [05/Dec/2020 22:57:03] "GET /System%280c00c35080000106%29/ HTTP/1.1" 200 -
    192.168.56.1 - - [05/Dec/2020 22:57:07] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/ HTTP/1.1" 200 -
    192.168.56.1 - - [05/Dec/2020 22:57:09] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/Reports%280b00c35080000350%29/ HTTP/1.1" 200 -
    192.168.56.1 - - [05/Dec/2020 22:57:14] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/Reports%280b00c35080000350%29/ConsistencyChecker%280900c3508000211e%29/ HTTP/1.1" 200 -
    192.168.56.1 - - [05/Dec/2020 22:57:22] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/Reports%280b00c35080000350%29/StateOfDocbase%280900c35080002950%29/ HTTP/1.1" 200 -
    192.168.56.1 - - [05/Dec/2020 22:57:27] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/ HTTP/1.1" 200 -
    ...
    

    The logged information and format are quite standard for web servers, one log line per request, beginning with the client’s ip address, the timestamp, request type (there will be only GETs as the utility is read-only) and resource, and the returned http status code.
    If the variable DctmAPI.logLevel is set to True (or 1 or a non-empty string or collection, as python interprets them all as the boolean True) in the main program, API statements and messages from the repository are logged to stdout too, which can help if troubleshooting is needed, e.g.:

    dmadmin@dmclient:~/dctm-webserver$ python browser_repo.py --port 9000 --docbase dmtest73 --user_name dmadmin --password dmadmin 
    'in dmInit()' 
    "dm= after loading library libdmcl.so" 
    'exiting dmInit()' 
    dmInit() was successful
    'in connect(), docbase = dmtest73, user_name = dmadmin, password = dmadmin' 
    'successful session s0' 
    '[DM_SESSION_I_SESSION_START]info:  "Session 0100c35080002e3d started for user dmadmin."' 
    'exiting connect()' 
    Serving HTTP on 0.0.0.0 port 9000 ...
    'in select2dict(), dql_stmt=select object_name, r_object_id from dm_cabinet' 
    192.168.56.1 - - [05/Dec/2020 23:02:59] "GET / HTTP/1.1" 200 -
    "in select2dict(), dql_stmt=select object_name, r_object_id from dm_document where folder(ID('0c00c35080000106')) UNION select object_name, r_object_id from dm_folder where folder(ID('0c00c35080000106'))" 
    192.168.56.1 - - [05/Dec/2020 23:03:03] "GET /System%280c00c35080000106%29/ HTTP/1.1" 200 -
    "in select2dict(), dql_stmt=select object_name, r_object_id from dm_document where folder(ID('0b00c3508000034e')) UNION select object_name, r_object_id from dm_folder where folder(ID('0b00c3508000034e'))" 
    192.168.56.1 - - [05/Dec/2020 23:03:05] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/ HTTP/1.1" 200 -
    "in select2dict(), dql_stmt=select object_name, r_object_id from dm_document where folder(ID('0b00c35080000350')) UNION select object_name, r_object_id from dm_folder where folder(ID('0b00c35080000350'))" 
    192.168.56.1 - - [05/Dec/2020 23:03:10] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/Reports%280b00c35080000350%29/ HTTP/1.1" 200 -
    "in select2dict(), dql_stmt=select mime_type from dm_format where r_object_id in (select format from dmr_content c, dm_document d where any c.parent_id = d.r_object_id and d.r_object_id = '0900c3508000211e')" 
    192.168.56.1 - - [05/Dec/2020 23:03:11] "GET /System%280c00c35080000106%29/Sysadmin%280b00c3508000034e%29/Reports%280b00c35080000350%29/ConsistencyChecker%280900c3508000211e%29/ HTTP/1.1" 200 -
    

    Feel free to initialize that variable from the command-line if you prefer.
    A nice touch in the original module is that execution errors are trapped in an exception handler so the program does not need to be restarted in case of failure. As it is stateless, errors have no effect on subsequent requests.
    Several views on the same repositories can be obtained by starting several instances of the program at once with different listening ports. Similarly, if one feels the urge to explore several repositories at once, just start as many modules as needed with different listening ports and appropriate credentials.
    To exit the program, just type ctrl-c; no data will be lost here as the program just browses repositories in read-only mode.

    A few comments on the customizations

    Lines 8 and 9 in the diff above introduce the regular expressions that will be used later to extract the path component/r_object_id couples from the URL's path part; “split” is for one such couple anywhere in the path and “last” is for the last one, and is aimed at getting the r_object_id of the folder that is clicked on from its full path name. python's re module allows pre-compiling them for efficiency. Note the .+? syntax to specify a non-greedy regular expression.
    On line 13, the function isdir() is now implemented in the module DctmBrowser and returns True if the clicked item is a folder.
    Similarly, line 25 calls a reimplementation of os.open() in module DctmBrowser that exports locally the clicked document’s content to /tmp, opens it and returns the file handle; this will allow the content to be sent to the browser for visualization.
    Line 31 calls a reimplementation of os.listdir() to list the content of the clicked repository folder.
    Line 37 applies the “split” regular expression to the current folder path to extract its tuple components (returned in an array of sub-path/r_object_id couples) and then concatenates the sub-paths together to get the current folder to be displayed later. More concretely, it allows going from
    /System(0c00c35080000106)/Sysadmin(0b00c3508000034e)/Reports(0b00c35080000350)/
    to
    /System/Sysadmin/Reports
    which is displayed in the html page’s title.
    The conciseness of the expression passed to the join() is admirable; lots of programming mistakes and low-level verbosity are prevented thanks to python's list comprehensions.
    Similarly, on line 52, the current folder’s parent folder is computed from the current path.
    On line 86, the second regular expression, “last”, is applied to extract the r_object_id of the current folder (i.e. the one that is clicked on).
    Lines 89 to 121 were removed from the original module because mime processing is much simpler here: the repository maintains a list of mime formats (table dm_format) and the selected document's mime type can be found by just looking up that table, see function splitext() in module DctmBrowser, called on line 27. By returning a valid mime type to the browser, the latter can process the content cleverly, i.e. display the supported content types (such as text) and prompt for some other action otherwise (e.g. office documents).
    On line 126, the session id is passed to the class SimpleHTTPRequestHandler and stored as a class variable; later it is referenced as SimpleHTTPRequestHandler.session in the class, but self.session would work too, although I prefer the former syntax as it makes clear that the session does not depend on the instantiations of the class; the session is valid for all of them. As the program connects to only one repository at startup time, there is no need to make session an instance variable.
    The module DctmBrowser is used as a bridge between the module DctmAPI and the main program browser_repo.py. This is where most of the repository stuff is done. As is obvious here, not much is needed to go from listing directories and files on a filesystem to listing folders and documents in a repository.

    Security

    As shown by the usage message above (option --help), a bind address can be specified. By default, the embedded web server listens on all the machine's network interfaces and, as there is no authentication against the web server, another machine on the same network could reach the web server and access the repository through the opened session, if there is no firewall in the way. To prevent this, just specify the loopback IP address, 127.0.0.1 or localhost:

    dmadmin@dmclient:~/dctm-webserver$ python browser_repo.py --bind 127.0.0.1 --port 9000 --docbase dmtest73 --user_name dmadmin --password dmadmin 
    ...
    Serving HTTP on 127.0.0.1 port 9000 ...
    
    # testing locally (no GUI on server, using wget):
    dmadmin@dmclient:~/dctm-webserver$ wget 127.0.0.1:9000
    --2020-12-05 22:06:02--  http://127.0.0.1:9000/
    Connecting to 127.0.0.1:9000... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 831 
    Saving to: 'index.html'
    
    index.html                                           100%[=====================================================================================================================>]     831  --.-KB/s    in 0s      
    
    2020-12-05 22:06:03 (7.34 MB/s) - 'index.html' saved [831/831]
    
    dmadmin@dmclient:~/dctm-webserver$ cat index.html 
    Repository listing for /

    Repository listing for /

    In addition, as the web server carries the client’s IP address (variable self.address_string), some more finely tuned address restriction could also be implemented by filtering out unwelcome clients and letting in authorized ones.
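
    If the program is bound to the loopback interface but you still want to reach it comfortably from your desktop's browser, an ssh tunnel is a simple work-around (host name and port are the ones used in this example):

    # on the desktop; then point the local browser to http://localhost:9000/
    $ ssh -N -L 9000:127.0.0.1:9000 dmadmin@dmclient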
    Presently, the original module does not support https and hence the network traffic between clients and server is left unencrypted. However, one could imagine installing a small nginx or apache web server as a front end on the same machine, setting up security at that level and adding a redirection to the python module listening on localhost over http, a quick and easy solution that does not require any change in the code, although that would be way out of scope of the module, whose primary goal is to serve requests from the same machine it is running on. Note that if we start talking about adding another web server, we could as well move all the repository browsing code into a separate (Fast)CGI python program directly invoked by the web server and make it available to any allowed networked user as a full-blown service, complete with authentication and access rights.

    Conclusion

    This tool is really a nice utility for browsing repositories, especially those running on Unix/Linux machines, because most of the time those servers are headless and have no native applications installed. The tool interfaces any browser, running on any O/S or device, with such repositories and alleviates the usual burden of executing getfile API statements and scp commands to transfer the contents to the desktop for visualization. For this precise functionality, it is even better than dqman, at least for browsing and visualizing browser-readable contents.
    There is a lot of room for improvement if one wanted a full repository browser, e.g. to display the metadata as well. In addition, if needed, the original module's functionality, browsing the local sub-directory tree, could be reestablished as it is not incompatible with repositories.
    The tool also proves again that the approach of picking an existing tool that implements most of the requirements and customizing it to a specific need is a very effective one.


    Oracle 21c Security : ORA_STIG_PROFILE and ORA_CIS_PROFILE


    In my previous blog I tested the creation of a new Oracle 21c database. In this blog I am talking about two security-related changes.
    In each new release Oracle strengthens security. That's why, since Oracle 12.2, to meet Security Technical Implementation Guide (STIG) compliance, Oracle Database has provided the ORA_STIG_PROFILE profile.
    With Oracle 21c, the ORA_STIG_PROFILE profile was updated and Oracle provides a new profile to meet the CIS standard: ORA_CIS_PROFILE.
    The ORA_STIG_PROFILE user profile has been updated with the latest Security Technical Implementation Guide (STIG) guidelines.
    The ORA_CIS_PROFILE carries the latest Center for Internet Security (CIS) guidelines.

    ORA_STIG_PROFILE
    In an Oracle 19c database, we can find the following for the ORA_STIG_PROFILE:

    SQL> select profile,resource_name,limit from dba_profiles where profile='ORA_STIG_PROFILE' order by resource_name;
    
    PROFILE                        RESOURCE_NAME                  LIMIT
    ------------------------------ ------------------------------ ------------------------------
    ORA_STIG_PROFILE               COMPOSITE_LIMIT                DEFAULT
    ORA_STIG_PROFILE               CONNECT_TIME                   DEFAULT
    ORA_STIG_PROFILE               CPU_PER_CALL                   DEFAULT
    ORA_STIG_PROFILE               CPU_PER_SESSION                DEFAULT
    ORA_STIG_PROFILE               FAILED_LOGIN_ATTEMPTS          3
    ORA_STIG_PROFILE               IDLE_TIME                      15
    ORA_STIG_PROFILE               INACTIVE_ACCOUNT_TIME          35
    ORA_STIG_PROFILE               LOGICAL_READS_PER_CALL         DEFAULT
    ORA_STIG_PROFILE               LOGICAL_READS_PER_SESSION      DEFAULT
    ORA_STIG_PROFILE               PASSWORD_GRACE_TIME            5
    ORA_STIG_PROFILE               PASSWORD_LIFE_TIME             60
    ORA_STIG_PROFILE               PASSWORD_LOCK_TIME             UNLIMITED
    ORA_STIG_PROFILE               PASSWORD_REUSE_MAX             10
    ORA_STIG_PROFILE               PASSWORD_REUSE_TIME            365
    ORA_STIG_PROFILE               PASSWORD_VERIFY_FUNCTION       ORA12C_STIG_VERIFY_FUNCTION
    ORA_STIG_PROFILE               PRIVATE_SGA                    DEFAULT
    ORA_STIG_PROFILE               SESSIONS_PER_USER              DEFAULT
    
    17 rows selected.
    
    SQL>
    

    Now, in Oracle 21c, we can see that there are some changes.

    SQL> select profile,resource_name,limit from dba_profiles where profile='ORA_STIG_PROFILE' order by RESOURCE_NAME;
    
    PROFILE                        RESOURCE_NAME                  LIMIT
    ------------------------------ ------------------------------ ------------------------------
    ORA_STIG_PROFILE               COMPOSITE_LIMIT                DEFAULT
    ORA_STIG_PROFILE               CONNECT_TIME                   DEFAULT
    ORA_STIG_PROFILE               CPU_PER_CALL                   DEFAULT
    ORA_STIG_PROFILE               CPU_PER_SESSION                DEFAULT
    ORA_STIG_PROFILE               FAILED_LOGIN_ATTEMPTS          3
    ORA_STIG_PROFILE               IDLE_TIME                      15
    ORA_STIG_PROFILE               INACTIVE_ACCOUNT_TIME          35
    ORA_STIG_PROFILE               LOGICAL_READS_PER_CALL         DEFAULT
    ORA_STIG_PROFILE               LOGICAL_READS_PER_SESSION      DEFAULT
    ORA_STIG_PROFILE               PASSWORD_GRACE_TIME            0
    ORA_STIG_PROFILE               PASSWORD_LIFE_TIME             35
    ORA_STIG_PROFILE               PASSWORD_LOCK_TIME             UNLIMITED
    ORA_STIG_PROFILE               PASSWORD_REUSE_MAX             5
    ORA_STIG_PROFILE               PASSWORD_REUSE_TIME            175
    ORA_STIG_PROFILE               PASSWORD_ROLLOVER_TIME         DEFAULT
    ORA_STIG_PROFILE               PASSWORD_VERIFY_FUNCTION       ORA12C_STIG_VERIFY_FUNCTION
    ORA_STIG_PROFILE               PRIVATE_SGA                    DEFAULT
    ORA_STIG_PROFILE               SESSIONS_PER_USER              DEFAULT
    
    18 rows selected.
    
    SQL>
    

    The following parameters were updated:

    - PASSWORD_GRACE_TIME
    - PASSWORD_LIFE_TIME
    - PASSWORD_REUSE_MAX
    - PASSWORD_REUSE_TIME

    and there is a new parameter, PASSWORD_ROLLOVER_TIME.

    ORA_CIS_PROFILE
    Below are the characteristics of this new profile:

    SQL> select profile,resource_name,limit from dba_profiles where profile='ORA_CIS_PROFILE' order by RESOURCE_NAME;
    
    PROFILE                        RESOURCE_NAME                  LIMIT
    ------------------------------ ------------------------------ ------------------------------
    ORA_CIS_PROFILE                COMPOSITE_LIMIT                DEFAULT
    ORA_CIS_PROFILE                CONNECT_TIME                   DEFAULT
    ORA_CIS_PROFILE                CPU_PER_CALL                   DEFAULT
    ORA_CIS_PROFILE                CPU_PER_SESSION                DEFAULT
    ORA_CIS_PROFILE                FAILED_LOGIN_ATTEMPTS          5
    ORA_CIS_PROFILE                IDLE_TIME                      DEFAULT
    ORA_CIS_PROFILE                INACTIVE_ACCOUNT_TIME          120
    ORA_CIS_PROFILE                LOGICAL_READS_PER_CALL         DEFAULT
    ORA_CIS_PROFILE                LOGICAL_READS_PER_SESSION      DEFAULT
    ORA_CIS_PROFILE                PASSWORD_GRACE_TIME            5
    ORA_CIS_PROFILE                PASSWORD_LIFE_TIME             90
    ORA_CIS_PROFILE                PASSWORD_LOCK_TIME             1
    ORA_CIS_PROFILE                PASSWORD_REUSE_MAX             20
    ORA_CIS_PROFILE                PASSWORD_REUSE_TIME            365
    ORA_CIS_PROFILE                PASSWORD_ROLLOVER_TIME         DEFAULT
    ORA_CIS_PROFILE                PASSWORD_VERIFY_FUNCTION       ORA12C_VERIFY_FUNCTION
    ORA_CIS_PROFILE                PRIVATE_SGA                    DEFAULT
    ORA_CIS_PROFILE                SESSIONS_PER_USER              10
    
    18 rows selected.
    
    SQL>
    

    These user profiles can be assigned directly to database users or used as a basis for your own user profiles. Oracle keeps them up to date to make it easier for you to implement password policies that meet the STIG and CIS guidelines.
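
    Assigning them is the usual one-liner, for example (the user names are made up):

    SQL> alter user appuser profile ORA_STIG_PROFILE;
    SQL> create user appcis identified by "SomeStrongPassword#21" profile ORA_CIS_PROFILE;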


    Oracle 21c Security : Gradual Database Password Rollover


    Starting with Oracle 21c, an application's password can be changed without having to schedule downtime. This can be done by using the new profile parameter PASSWORD_ROLLOVER_TIME.
    This sets a rollover period during which the application can log in using either the old password or the new one. With this enhancement, an administrator no longer needs to take the application down while the application database password is being rotated.
    Let's see in this blog how this works.

    SQL> show pdbs
    
        CON_ID CON_NAME                       OPEN MODE  RESTRICTED
    ---------- ------------------------------ ---------- ----------
             2 PDB$SEED                       READ ONLY  NO
             3 PDB1                           READ WRITE NO
    SQL>
    

    First we create a profile in PDB1

    SQL> show con_name;
    
    CON_NAME
    ------------------------------
    PDB1
    
    
    SQL> CREATE PROFILE testgradualrollover LIMIT
     FAILED_LOGIN_ATTEMPTS 4
     PASSWORD_ROLLOVER_TIME 4;  
    
    Profile created.
    
    SQL>
    

    Note that the PASSWORD_ROLLOVER_TIME parameter is specified in days; for example, 1/24 means 1 hour.
    The minimum value for this parameter is 1 hour and the maximum value is 60 days or the lower value of the PASSWORD_LIFE_TIME and PASSWORD_GRACE_TIME parameters.
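
    For instance, to switch the profile created above to a 1-hour rollover window, something like this would do:

    SQL> alter profile testgradualrollover limit password_rollover_time 1/24;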
    Now let’s create a new user in PDB1 and let’s assign him the profile we created

    SQL> create user edge identified by "Borftg8957##"  profile testgradualrollover;
    
    User created.
    
    SQL> grant create session to edge;
    
    Grant succeeded.
    
    SQL>
    

    We can also verify the status of the account in the PDB

    SQL>  select username,account_status from dba_users where username='EDGE';
    
    USERNAME             ACCOUNT_STATUS
    -------------------- --------------------
    EDGE                 OPEN
    
    SQL>
    

    Now let's log in with the new user

    
    [oracle@oraadserver admin]$ sqlplus edge/"Borftg8957##"@pdb1
    
    SQL*Plus: Release 21.0.0.0.0 - Production on Thu Dec 10 11:14:07 2020
    Version 21.1.0.0.0
    
    Copyright (c) 1982, 2020, Oracle.  All rights reserved.
    
    
    Connected to:
    Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
    Version 21.1.0.0.0
    
    SQL> show con_name;
    
    CON_NAME
    ------------------------------
    PDB1
    SQL> show user;
    USER is "EDGE"
    SQL>
    

    Now let’s change the password of the user edge

    SQL> alter user edge identified by "Morfgt5879!!";
    
    User altered.
    
    SQL>
    

    As the rollover period is set to 4 days in the profile testgradualrollover, the user edge should be able to connect for 4 days with either the old password or the new one.
    Let's test with the old password

    [oracle@oraadserver admin]$ sqlplus edge/"Borftg8957##"@pdb1
    
    SQL*Plus: Release 21.0.0.0.0 - Production on Thu Dec 10 11:21:02 2020
    Version 21.1.0.0.0
    
    Copyright (c) 1982, 2020, Oracle.  All rights reserved.
    
    Last Successful login time: Thu Dec 10 2020 11:14:07 +01:00
    
    Connected to:
    Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
    Version 21.1.0.0.0
    
    SQL> show con_name;
    
    CON_NAME
    ------------------------------
    PDB1
    SQL> show user;
    USER is "EDGE"
    SQL>
    

    Let’s test with the new password

    [oracle@oraadserver ~]$ sqlplus edge/'Morfgt5879!!'@pdb1
    
    SQL*Plus: Release 21.0.0.0.0 - Production on Thu Dec 10 11:24:52 2020
    Version 21.1.0.0.0
    
    Copyright (c) 1982, 2020, Oracle.  All rights reserved.
    
    Last Successful login time: Thu Dec 10 2020 11:21:02 +01:00
    
    Connected to:
    Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
    Version 21.1.0.0.0
    
    SQL> show user;
    USER is "EDGE"
    
    SQL> show con_name;
    
    CON_NAME
    ------------------------------
    PDB1
    SQL>
    

    We can see that the connection succeeds in both cases. If we query the DBA_USERS view, we can see the status of the rollover:

    SQL> select username,account_status from dba_users where username='EDGE';
    
    USERNAME             ACCOUNT_STATUS
    -------------------- --------------------
    EDGE                 OPEN & IN ROLLOVER
    

    To end the password rollover period, you can:
    - Let the rollover period expire on its own
    - As either the user or an administrator, run the following command

        Alter user edge expire password rollover period;
    

    - As an administrator, expire the user’s password

    Alter user edge password expire;
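
    Whichever method is used, the account status can be checked again with the same query as before; once the rollover period has ended, the status should no longer show IN ROLLOVER (depending on the method, it may be OPEN or EXPIRED):

    SQL> select username,account_status from dba_users where username='EDGE';
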
    

    The database behavior during the gradual password rollover period can be found in the documentation.

    Cet article Oracle 21c Security : Gradual Database Password Rollover est apparu en premier sur Blog dbi services.

    Easy failover and switchover with pg_auto_failover


    One of the really cool things with PostgreSQL is that you have plenty of choices when it comes to tooling. For high availability we usually go with Patroni, but there is also pg_auto_failover, and this will be the topic of this post. Because of the recent announcement around CentOS we’ll go with Debian this time. What is already prepared is the PostgreSQL installation (version 13.1), but nothing else. We start from scratch to see if the statement on the GitHub page, that it “is optimized for simplicity and correctness”, holds true.

    This is the setup we’ll start with:

    Hostname                    IP-Address       Initial role
    pgaf1.it.dbi-services.com   192.168.22.190   Primary and pg_auto_failover monitor
    pgaf2.it.dbi-services.com   192.168.22.191   First replica
    pgaf3.it.dbi-services.com   192.168.22.192   Second replica

    As said above, all three nodes have PostgreSQL 13.1 already installed at this location (PostgreSQL was installed from source code, but that should not really matter):

    postgres@pgaf1:~$ ls /u01/app/postgres/product/13/db_1/
    bin  include  lib  share
    

    What I did in addition, is to create ssh keys and then copy those from each machine to all nodes so password-less ssh connections are available between the nodes (here is the example from the first node):

    postgres@pgaf1:~$ ssh-keygen
    postgres@pgaf1:~$ ssh-copy-id postgres@pgaf1
    postgres@pgaf1:~$ ssh-copy-id postgres@pgaf2
    postgres@pgaf1:~$ ssh-copy-id postgres@pgaf3
    

    For installing pg_auto_failover from source make sure that pg_config is in your path:

    postgres@pgaf1:~$ which pg_config
    /u01/app/postgres/product/13/db_1//bin/pg_config
    

    Once that is ready, getting pg_auto_failover installed is quite simple:

    postgres@pgaf1:~$ git clone https://github.com/citusdata/pg_auto_failover.git
    Cloning into 'pg_auto_failover'...
    remote: Enumerating objects: 252, done.
    remote: Counting objects: 100% (252/252), done.
    remote: Compressing objects: 100% (137/137), done.
    remote: Total 8131 (delta 134), reused 174 (delta 115), pack-reused 7879
    Receiving objects: 100% (8131/8131), 5.07 MiB | 1.25 MiB/s, done.
    Resolving deltas: 100% (6022/6022), done.
    postgres@pgaf1:~$ cd pg_auto_failover/
    postgres@pgaf1:~$ make
    make -C src/monitor/ all
    make[1]: Entering directory '/home/postgres/pg_auto_failover/src/monitor'
    gcc -std=c99 -D_GNU_SOURCE -g -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -O2 -Wformat -Wall -Werror=implicit-int -Werror=implicit-function-declaration -Werror=return-type -Wno-declaration-after-statement -Wno-missing-braces  -fPIC -std=c99 -Wall -Werror -Wno-unused-parameter -Iinclude -I/u01/app/postgres/product/13/db_1/include -g -I. -I./ -I/u01/app/postgres/product/13/db_1/include/server -I/u01/app/postgres/product/13/db_1/include/internal  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o metadata.o metadata.c
    ...
    make[2]: Leaving directory '/home/postgres/pg_auto_failover/src/bin/pg_autoctl'
    make[1]: Leaving directory '/home/postgres/pg_auto_failover/src/bin'
    postgres@pgaf1:~$ make install
    make -C src/monitor/ all
    make[1]: Entering directory '/home/postgres/pg_auto_failover/src/monitor'
    make[1]: Nothing to be done for 'all'.
    ...
    

    This needs to be done on all hosts, of course. You will notice a new extension and new binaries in your PostgreSQL installation:

    postgres@pgaf1:~$ ls /u01/app/postgres/product/13/db_1/share/extension/*pgauto*
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover--1.0--1.1.sql
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover--1.0.sql
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover--1.1--1.2.sql
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover--1.2--1.3.sql
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover--1.3--1.4.sql
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover--1.4--dummy.sql
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover--1.4.sql
    /u01/app/postgres/product/13/db_1/share/extension/pgautofailover.control
    postgres@pgaf1:~$ ls /u01/app/postgres/product/13/db_1/bin/*auto*
    /u01/app/postgres/product/13/db_1/bin/pg_autoctl
    

    Having that available we’ll need to initialize the pg_auto_failover monitor which is responsible for assigning roles and health-checking. We’ll do that in the first node:

    postgres@pgaf1:~$ export PGDATA=/u02/pgdata/13/monitor
    postgres@pgaf1:~$ export PGPORT=5433
    postgres@pgaf1:~$ pg_autoctl create monitor --ssl-self-signed --hostname pgaf1.it.dbi-services.com --auth trust --run
    14:45:40 13184 INFO  Using default --ssl-mode "require"
    14:45:40 13184 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
    14:45:40 13184 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
    14:45:40 13184 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
    14:45:40 13184 INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/13/monitor"
    14:45:40 13184 INFO  /u01/app/postgres/product/13/db_1/bin/pg_ctl initdb -s -D /u02/pgdata/13/monitor --option '--auth=trust'
    14:45:42 13184 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /u02/pgdata/13/monitor/server.crt -keyout /u02/pgdata/13/monitor/server.key -subj "/CN=pgaf1.it.dbi-services.com"
    14:45:42 13184 INFO  Started pg_autoctl postgres service with pid 13204
    14:45:42 13184 INFO  Started pg_autoctl listener service with pid 13205
    14:45:42 13204 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service postgres --pgdata /u02/pgdata/13/monitor -v
    14:45:42 13209 INFO   /u01/app/postgres/product/13/db_1/bin/postgres -D /u02/pgdata/13/monitor -p 5433 -h *
    14:45:42 13205 ERROR Connection to database failed: could not connect to server: No such file or directory
    14:45:42 13205 ERROR    Is the server running locally and accepting
    14:45:42 13205 ERROR    connections on Unix domain socket "/tmp/.s.PGSQL.5433"?
    14:45:42 13205 ERROR Failed to connect to local Postgres database at "port=5433 dbname=postgres", see above for details
    14:45:42 13205 ERROR Failed to create user "autoctl" on local postgres server
    14:45:42 13184 ERROR pg_autoctl service listener exited with exit status 12
    14:45:42 13184 INFO  Restarting service listener
    14:45:42 13204 INFO  Postgres is now serving PGDATA "/u02/pgdata/13/monitor" on port 5433 with pid 13209
    14:45:43 13221 WARN  NOTICE:  installing required extension "btree_gist"
    14:45:43 13221 INFO  Granting connection privileges on 192.168.22.0/24
    14:45:43 13221 INFO  Your pg_auto_failover monitor instance is now ready on port 5433.
    14:45:43 13221 INFO  Monitor has been successfully initialized.
    14:45:43 13221 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service listener --pgdata /u02/pgdata/13/monitor -v
    14:45:43 13221 INFO  Managing the monitor at postgres://autoctl_node@pgaf1.it.dbi-services.com:5433/pg_auto_failover?sslmode=require
    14:45:43 13221 INFO  Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/u02/pgdata/13/monitor/pg_autoctl.cfg"
    14:45:44 13221 INFO  The version of extension "pgautofailover" is "1.4" on the monitor
    14:45:44 13221 INFO  Contacting the monitor to LISTEN to its events.
    

    This created a standard PostgreSQL cluster in the background:

    postgres@pgaf1:~$ ls /u02/pgdata/13/monitor/
    base              pg_dynshmem    pg_notify     pg_stat_tmp  pg_wal                         postmaster.opts
    current_logfiles  pg_hba.conf    pg_replslot   pg_subtrans  pg_xact                        postmaster.pid
    global            pg_ident.conf  pg_serial     pg_tblspc    postgresql.auto.conf           server.crt
    log               pg_logical     pg_snapshots  pg_twophase  postgresql-auto-failover.conf  server.key
    pg_commit_ts      pg_multixact   pg_stat       PG_VERSION   postgresql.conf                startup.log
    postgres@pgaf1:~$ ps -ef | grep "postgres \-D"
    postgres 13209 13204  0 14:45 pts/0    00:00:00 /u01/app/postgres/product/13/db_1/bin/postgres -D /u02/pgdata/13/monitor -p 5433 -h *
    

    Before we can initialize the primary instance we need to get the connection string to the monitor:

    postgres@pgaf1:~$ pg_autoctl show uri --monitor --pgdata /u02/pgdata/13/monitor/
    postgres://autoctl_node@pgaf1.it.dbi-services.com:5433/pg_auto_failover?sslmode=require
    

    Create the primary:

    postgres@pgaf1:~$ pg_autoctl create postgres \
    >     --hostname pgaf1.it.dbi-services.com \
    >     --auth trust \
    >     --ssl-self-signed \
    >     --monitor 'postgres://autoctl_node@pgaf1.it.dbi-services.com:5433/pg_auto_failover?sslmode=require' \
    >     --run
    14:52:11 13354 INFO  Using default --ssl-mode "require"
    14:52:11 13354 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
    14:52:11 13354 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
    14:52:11 13354 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
    14:52:11 13354 INFO  Started pg_autoctl postgres service with pid 13356
    14:52:11 13354 INFO  Started pg_autoctl node-active service with pid 13357
    14:52:11 13356 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service postgres --pgdata /u02/pgdata/13/PG1 -v
    14:52:11 13357 INFO  Registered node 1 (pgaf1.it.dbi-services.com:5432) with name "node_1" in formation "default", group 0, state "single"
    14:52:11 13357 INFO  Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.state"
    14:52:11 13357 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.init"
    14:52:11 13357 INFO  Successfully registered as "single" to the monitor.
    14:52:11 13357 INFO  FSM transition from "init" to "single": Start as a single node
    14:52:11 13357 INFO  Initialising postgres as a primary
    14:52:11 13357 INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/13/PG1"
    14:52:11 13357 INFO  /u01/app/postgres/product/13/db_1/bin/pg_ctl initdb -s -D /u02/pgdata/13/PG1 --option '--auth=trust'
    14:52:14 13357 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /u02/pgdata/13/PG1/server.crt -keyout /u02/pgdata/13/PG1/server.key -subj "/CN=pgaf1.it.dbi-services.com"
    14:52:14 13385 INFO   /u01/app/postgres/product/13/db_1/bin/postgres -D /u02/pgdata/13/PG1 -p 5432 -h *
    14:52:14 13357 INFO  CREATE DATABASE postgres;
    14:52:14 13356 INFO  Postgres is now serving PGDATA "/u02/pgdata/13/PG1" on port 5432 with pid 13385
    14:52:14 13357 INFO  The database "postgres" already exists, skipping.
    14:52:14 13357 INFO  CREATE EXTENSION pg_stat_statements;
    14:52:14 13357 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /u02/pgdata/13/PG1/server.crt -keyout /u02/pgdata/13/PG1/server.key -subj "/CN=pgaf1.it.dbi-services.com"
    14:52:14 13357 INFO  Contents of "/u02/pgdata/13/PG1/postgresql-auto-failover.conf" have changed, overwriting
    14:52:14 13357 INFO  Transition complete: current state is now "single"
    14:52:14 13357 INFO  keeper has been successfully initialized.
    14:52:14 13357 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service node-active --pgdata /u02/pgdata/13/PG1 -v
    14:52:14 13357 INFO  Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.cfg"
    14:52:14 13357 INFO  pg_autoctl service is running, current state is "single"
    

    Repeating the same command on the second node (with a different --hostname) will initialize the first replica:

    postgres@pgaf2:~$ export PGDATA=/u02/pgdata/13/PG1
    postgres@pgaf2:~$ export PGPORT=5432
    postgres@pgaf2:~$ pg_autoctl create postgres \
    >     --hostname pgaf2.it.dbi-services.com \
    >     --auth trust \
    >     --ssl-self-signed \
    >     --monitor 'postgres://autoctl_node@pgaf1.it.dbi-services.com:5433/pg_auto_failover?sslmode=require' \
    >     --run
    14:54:09 13010 INFO  Using default --ssl-mode "require"
    14:54:09 13010 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
    14:54:09 13010 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
    14:54:09 13010 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
    14:54:09 13010 INFO  Started pg_autoctl postgres service with pid 13012
    14:54:09 13010 INFO  Started pg_autoctl node-active service with pid 13013
    14:54:09 13012 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service postgres --pgdata /u02/pgdata/13/PG1 -v
    14:54:09 13013 INFO  Registered node 2 (pgaf2.it.dbi-services.com:5432) with name "node_2" in formation "default", group 0, state "wait_standby"
    14:54:09 13013 INFO  Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.state"
    14:54:09 13013 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.init"
    14:54:09 13013 INFO  Successfully registered as "wait_standby" to the monitor.
    14:54:09 13013 INFO  FSM transition from "init" to "wait_standby": Start following a primary
    14:54:09 13013 INFO  Transition complete: current state is now "wait_standby"
    14:54:09 13013 INFO  New state for node 1 "node_1" (pgaf1.it.dbi-services.com:5432): single ➜ wait_primary
    14:54:09 13013 INFO  New state for node 1 "node_1" (pgaf1.it.dbi-services.com:5432): wait_primary ➜ wait_primary
    14:54:09 13013 INFO  Still waiting for the monitor to drive us to state "catchingup"
    14:54:09 13013 WARN  Please make sure that the primary node is currently running `pg_autoctl run` and contacting the monitor.
    14:54:09 13013 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
    14:54:09 13013 INFO  Initialising PostgreSQL as a hot standby
    14:54:09 13013 INFO   /u01/app/postgres/product/13/db_1/bin/pg_basebackup -w -d application_name=pgautofailover_standby_2 host=pgaf1.it.dbi-services.com port=5432 user=pgautofailover_replicator sslmode=require --pgdata /u02/pgdata/13/backup/node_2 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_2
    14:54:09 13013 INFO  pg_basebackup: initiating base backup, waiting for checkpoint to complete
    14:54:15 13013 INFO  pg_basebackup: checkpoint completed
    14:54:15 13013 INFO  pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
    14:54:15 13013 INFO  pg_basebackup: starting background WAL receiver
    14:54:15 13013 INFO      0/23396 kB (0%), 0/1 tablespace (...ta/13/backup/node_2/backup_label)
    14:54:16 13013 INFO   1752/23396 kB (7%), 0/1 tablespace (...ata/13/backup/node_2/base/1/2610)
    14:54:16 13013 INFO  23406/23406 kB (100%), 0/1 tablespace (.../backup/node_2/global/pg_control)
    14:54:16 13013 INFO  23406/23406 kB (100%), 1/1 tablespace                                         
    14:54:16 13013 INFO  pg_basebackup:
    14:54:16 13013 INFO   
    14:54:16 13013 INFO  write-ahead log end point: 0/2000100
    14:54:16 13013 INFO  pg_basebackup:
    14:54:16 13013 INFO   
    14:54:16 13013 INFO  waiting for background process to finish streaming ...
    14:54:16 13013 INFO  pg_basebackup: syncing data to disk ...
    14:54:17 13013 INFO  pg_basebackup: renaming backup_manifest.tmp to backup_manifest
    14:54:17 13013 INFO  pg_basebackup: base backup completed
    14:54:17 13013 INFO  Creating the standby signal file at "/u02/pgdata/13/PG1/standby.signal", and replication setup at "/u02/pgdata/13/PG1/postgresql-auto-failover-standby.conf"
    14:54:17 13013 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /u02/pgdata/13/PG1/server.crt -keyout /u02/pgdata/13/PG1/server.key -subj "/CN=pgaf2.it.dbi-services.com"
    14:54:17 13021 INFO   /u01/app/postgres/product/13/db_1/bin/postgres -D /u02/pgdata/13/PG1 -p 5432 -h *
    14:54:19 13013 INFO  PostgreSQL started on port 5432
    14:54:19 13013 INFO  Fetched current list of 1 other nodes from the monitor to update HBA rules, including 1 changes.
    14:54:19 13013 INFO  Ensuring HBA rules for node 1 "node_1" (pgaf1.it.dbi-services.com:5432)
    14:54:19 13013 INFO  Transition complete: current state is now "catchingup"
    14:54:20 13012 INFO  Postgres is now serving PGDATA "/u02/pgdata/13/PG1" on port 5432 with pid 13021
    14:54:20 13013 INFO  keeper has been successfully initialized.
    14:54:20 13013 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service node-active --pgdata /u02/pgdata/13/PG1 -v
    14:54:20 13013 INFO  Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.cfg"
    14:54:20 13013 INFO  pg_autoctl service is running, current state is "catchingup"
    14:54:20 13013 INFO  Fetched current list of 1 other nodes from the monitor to update HBA rules, including 1 changes.
    14:54:20 13013 INFO  Ensuring HBA rules for node 1 "node_1" (pgaf1.it.dbi-services.com:5432)
    14:54:21 13013 INFO  Monitor assigned new state "secondary"
    14:54:21 13013 INFO  FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm up and running, and eligible for promotion again
    14:54:21 13013 INFO  Creating replication slot "pgautofailover_standby_1"
    14:54:21 13013 INFO  Transition complete: current state is now "secondary"
    14:54:21 13013 INFO  New state for node 1 "node_1" (pgaf1.it.dbi-services.com:5432): primary ➜ primary
    

    The last lines of the output confirm that pgaf1 hosts the primary cluster and pgaf2 now hosts a replica. Let’s do the same on the third node:

    postgres@pgaf3:~$ pg_autoctl create postgres \
    >     --hostname pgaf3.it.dbi-services.com \
    >     --auth trust \
    >     --ssl-self-signed \
    >     --monitor 'postgres://autoctl_node@pgaf1.it.dbi-services.com:5433/pg_auto_failover?sslmode=require' \
    >     --run
    14:57:19 12831 INFO  Using default --ssl-mode "require"
    14:57:19 12831 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
    14:57:19 12831 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
    14:57:19 12831 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
    14:57:19 12831 INFO  Started pg_autoctl postgres service with pid 12833
    14:57:19 12831 INFO  Started pg_autoctl node-active service with pid 12834
    14:57:19 12833 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service postgres --pgdata /u02/pgdata/13/PG1 -v
    14:57:19 12834 INFO  Registered node 3 (pgaf3.it.dbi-services.com:5432) with name "node_3" in formation "default", group 0, state "wait_standby"
    14:57:19 12834 INFO  Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.state"
    14:57:19 12834 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.init"
    14:57:19 12834 INFO  Successfully registered as "wait_standby" to the monitor.
    14:57:19 12834 INFO  FSM transition from "init" to "wait_standby": Start following a primary
    14:57:19 12834 INFO  Transition complete: current state is now "wait_standby"
    14:57:19 12834 INFO  New state for node 1 "node_1" (pgaf1.it.dbi-services.com:5432): primary ➜ join_primary
    14:57:20 12834 INFO  New state for node 1 "node_1" (pgaf1.it.dbi-services.com:5432): join_primary ➜ join_primary
    14:57:20 12834 INFO  Still waiting for the monitor to drive us to state "catchingup"
    14:57:20 12834 WARN  Please make sure that the primary node is currently running `pg_autoctl run` and contacting the monitor.
    14:57:20 12834 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
    14:57:20 12834 INFO  Initialising PostgreSQL as a hot standby
    14:57:20 12834 INFO   /u01/app/postgres/product/13/db_1/bin/pg_basebackup -w -d application_name=pgautofailover_standby_3 host=pgaf1.it.dbi-services.com port=5432 user=pgautofailover_replicator sslmode=require --pgdata /u02/pgdata/13/backup/node_3 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_3
    14:57:20 12834 INFO  pg_basebackup: initiating base backup, waiting for checkpoint to complete
    14:57:20 12834 INFO  pg_basebackup: checkpoint completed
    14:57:20 12834 INFO  pg_basebackup: write-ahead log start point: 0/4000028 on timeline 1
    14:57:20 12834 INFO  pg_basebackup: starting background WAL receiver
    14:57:20 12834 INFO      0/23397 kB (0%), 0/1 tablespace (...ta/13/backup/node_3/backup_label)
    14:57:20 12834 INFO  23406/23406 kB (100%), 0/1 tablespace (.../backup/node_3/global/pg_control)
    14:57:20 12834 INFO  23406/23406 kB (100%), 1/1 tablespace                                         
    14:57:20 12834 INFO  pg_basebackup: write-ahead log end point: 0/4000100
    14:57:20 12834 INFO  pg_basebackup: waiting for background process to finish streaming ...
    14:57:20 12834 INFO  pg_basebackup: syncing data to disk ...
    14:57:22 12834 INFO  pg_basebackup: renaming backup_manifest.tmp to backup_manifest
    14:57:22 12834 INFO  pg_basebackup: base backup completed
    14:57:22 12834 INFO  Creating the standby signal file at "/u02/pgdata/13/PG1/standby.signal", and replication setup at "/u02/pgdata/13/PG1/postgresql-auto-failover-standby.conf"
    14:57:22 12834 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /u02/pgdata/13/PG1/server.crt -keyout /u02/pgdata/13/PG1/server.key -subj "/CN=pgaf3.it.dbi-services.com"
    14:57:22 12841 INFO   /u01/app/postgres/product/13/db_1/bin/postgres -D /u02/pgdata/13/PG1 -p 5432 -h *
    14:57:22 12834 INFO  PostgreSQL started on port 5432
    14:57:22 12834 INFO  Fetched current list of 2 other nodes from the monitor to update HBA rules, including 2 changes.
    14:57:22 12834 INFO  Ensuring HBA rules for node 1 "node_1" (pgaf1.it.dbi-services.com:5432)
    14:57:22 12834 INFO  Ensuring HBA rules for node 2 "node_2" (pgaf2.it.dbi-services.com:5432)
    14:57:22 12834 ERROR Connection to database failed: could not connect to server: No such file or directory
    14:57:22 12834 ERROR    Is the server running locally and accepting
    14:57:22 12834 ERROR    connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
    14:57:22 12834 ERROR Failed to connect to local Postgres database at "port=5432 dbname=postgres", see above for details
    14:57:22 12834 ERROR Failed to reload the postgres configuration after adding the standby user to pg_hba
    14:57:22 12834 ERROR Failed to update the HBA entries for the new elements in the our formation "default" and group 0
    14:57:22 12834 ERROR Failed to update HBA rules after a base backup
    14:57:22 12834 ERROR Failed to transition from state "wait_standby" to state "catchingup", see above.
    14:57:22 12831 ERROR pg_autoctl service node-active exited with exit status 12
    14:57:22 12831 INFO  Restarting service node-active
    14:57:22 12845 INFO  Continuing from a previous `pg_autoctl create` failed attempt
    14:57:22 12845 INFO  PostgreSQL state at registration time was: PGDATA does not exists
    14:57:22 12845 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
    14:57:22 12845 INFO  Initialising PostgreSQL as a hot standby
    14:57:22 12845 INFO  Target directory exists: "/u02/pgdata/13/PG1", stopping PostgreSQL
    14:57:24 12833 INFO  Postgres is now serving PGDATA "/u02/pgdata/13/PG1" on port 5432 with pid 12841
    14:57:24 12833 INFO  Stopping pg_autoctl postgres service
    14:57:24 12833 INFO  /u01/app/postgres/product/13/db_1/bin/pg_ctl --pgdata /u02/pgdata/13/PG1 --wait stop --mode fast
    14:57:24 12845 INFO   /u01/app/postgres/product/13/db_1/bin/pg_basebackup -w -d application_name=pgautofailover_standby_3 host=pgaf1.it.dbi-services.com port=5432 user=pgautofailover_replicator sslmode=require --pgdata /u02/pgdata/13/backup/node_3 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_3
    14:57:24 12845 INFO  pg_basebackup:
    14:57:24 12845 INFO   
    14:57:24 12845 INFO  initiating base backup, waiting for checkpoint to complete
    14:57:24 12845 INFO  pg_basebackup:
    14:57:24 12845 INFO   
    14:57:24 12845 INFO  checkpoint completed
    14:57:24 12845 INFO  pg_basebackup:
    14:57:24 12845 INFO   
    14:57:24 12845 INFO  write-ahead log start point: 0/5000028 on timeline 1
    14:57:24 12845 INFO  pg_basebackup:
    14:57:24 12845 INFO   
    14:57:24 12845 INFO  starting background WAL receiver
    14:57:24 12845 INFO      0/23397 kB (0%), 0/1 tablespace (...ta/13/backup/node_3/backup_label)
    14:57:25 12845 INFO  16258/23397 kB (69%), 0/1 tablespace (...3/backup/node_3/base/12662/12512)
    14:57:25 12845 INFO  23406/23406 kB (100%), 0/1 tablespace (.../backup/node_3/global/pg_control)
    14:57:25 12845 INFO  23406/23406 kB (100%), 1/1 tablespace                                         
    14:57:25 12845 INFO  pg_basebackup: write-ahead log end point: 0/5000100
    14:57:25 12845 INFO  pg_basebackup: waiting for background process to finish streaming ...
    14:57:25 12845 INFO  pg_basebackup: syncing data to disk ...
    14:57:27 12845 INFO  pg_basebackup:
    14:57:27 12845 INFO   
    14:57:27 12845 INFO  renaming backup_manifest.tmp to backup_manifest
    14:57:27 12845 INFO  pg_basebackup:
    14:57:27 12845 INFO   
    14:57:27 12845 INFO  base backup completed
    14:57:27 12845 INFO  Creating the standby signal file at "/u02/pgdata/13/PG1/standby.signal", and replication setup at "/u02/pgdata/13/PG1/postgresql-auto-failover-standby.conf"
    14:57:27 12845 INFO   /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /u02/pgdata/13/PG1/server.crt -keyout /u02/pgdata/13/PG1/server.key -subj "/CN=pgaf3.it.dbi-services.com"
    14:57:27 12881 INFO   /u01/app/postgres/product/13/db_1/bin/postgres -D /u02/pgdata/13/PG1 -p 5432 -h *
    14:57:29 12845 INFO  PostgreSQL started on port 5432
    14:57:29 12845 INFO  Fetched current list of 2 other nodes from the monitor to update HBA rules, including 2 changes.
    14:57:29 12845 INFO  Ensuring HBA rules for node 1 "node_1" (pgaf1.it.dbi-services.com:5432)
    14:57:29 12845 INFO  Ensuring HBA rules for node 2 "node_2" (pgaf2.it.dbi-services.com:5432)
    14:57:29 12845 INFO  Transition complete: current state is now "catchingup"
    14:57:29 12845 INFO  keeper has been successfully initialized.
    14:57:29 12845 INFO   /u01/app/postgres/product/13/db_1/bin/pg_autoctl do service node-active --pgdata /u02/pgdata/13/PG1 -v
    14:57:29 12845 INFO  Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/u02/pgdata/13/PG1/pg_autoctl.cfg"
    14:57:29 12845 INFO  pg_autoctl service is running, current state is "catchingup"
    14:57:29 12845 INFO  Fetched current list of 2 other nodes from the monitor to update HBA rules, including 2 changes.
    14:57:29 12845 INFO  Ensuring HBA rules for node 1 "node_1" (pgaf1.it.dbi-services.com:5432)
    14:57:29 12845 INFO  Ensuring HBA rules for node 2 "node_2" (pgaf2.it.dbi-services.com:5432)
    14:57:29 12845 INFO  Monitor assigned new state "secondary"
    14:57:29 12845 INFO  FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm up and running, and eligible for promotion again
    14:57:29 12833 WARN  PostgreSQL was not running, restarted with pid 12881
    14:57:29 12845 INFO  Creating replication slot "pgautofailover_standby_1"
    14:57:29 12845 INFO  Creating replication slot "pgautofailover_standby_2"
    14:57:29 12845 INFO  Transition complete: current state is now "secondary"
    14:57:29 12845 INFO  New state for node 1 "node_1" (pgaf1.it.dbi-services.com:5432): primary ➜ primary
    

    That really was quite simple. We now have two replicas synchronizing from the same primary:

    postgres=# select usename,application_name,client_hostname,sent_lsn,write_lsn,flush_lsn,replay_lsn,write_lag from pg_stat_replication ;
              usename          |     application_name     |      client_hostname      | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag 
    ---------------------------+--------------------------+---------------------------+-----------+-----------+-----------+------------+-----------
     pgautofailover_replicator | pgautofailover_standby_2 | pgaf2.it.dbi-services.com | 0/6000148 | 0/6000148 | 0/6000148 | 0/6000148  | 
     pgautofailover_replicator | pgautofailover_standby_3 | pgaf3.it.dbi-services.com | 0/6000148 | 0/6000148 | 0/6000148 | 0/6000148  | 
    (2 rows)
    

    If you prepare that well, it is a matter of a few minutes and a setup like this is up and running. For the setup part, one bit is missing: all these pg_autoctl commands did not detach from the console but run in the foreground, and everything stops if we cancel the commands or close the terminal.

    Luckily pg_auto_failover comes with a handy command to create a systemd service file:

    postgres@pgaf1:~$ pg_autoctl -q show systemd --pgdata /u02/pgdata/13/monitor/ > pgautofailover.service
    postgres@pgaf1:~$ cat pgautofailover.service
    [Unit]
    Description = pg_auto_failover
    
    [Service]
    WorkingDirectory = /home/postgres
    Environment = 'PGDATA=/u02/pgdata/13/monitor/'
    User = postgres
    ExecStart = /u01/app/postgres/product/13/db_1/bin/pg_autoctl run
    Restart = always
    StartLimitBurst = 0
    
    [Install]
    WantedBy = multi-user.target
    

    This can easily be added to systemd so the monitor will start automatically:

    postgres@pgaf1:~$ sudo mv pgautofailover.service /etc/systemd/system
    postgres@pgaf1:~$ sudo systemctl daemon-reload
    postgres@pgaf1:~$ sudo systemctl enable pgautofailover.service
    Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
    postgres@pgaf1:~$ sudo systemctl start pgautofailover.service
    

    From now on the service will start automatically when the node boots up. Let’s do the same for the PostgreSQL clusters:

    postgres@pgaf1:~$ pg_autoctl -q show systemd --pgdata /u02/pgdata/13/PG1/ > postgresp1.service
    postgres@pgaf1:~$ cat postgresp1.service
    [Unit]
    Description = pg_auto_failover
    
    [Service]
    WorkingDirectory = /home/postgres
    Environment = 'PGDATA=/u02/pgdata/13/PG1/'
    User = postgres
    ExecStart = /u01/app/postgres/product/13/db_1/bin/pg_autoctl run
    Restart = always
    StartLimitBurst = 0
    
    [Install]
    WantedBy = multi-user.target
    postgres@pgaf1:~$ sudo mv postgresp1.service /etc/systemd/system
    postgres@pgaf1:~$ sudo systemctl daemon-reload
    postgres@pgaf1:~$ sudo systemctl enable postgresp1.service
    Created symlink /etc/systemd/system/multi-user.target.wants/postgresp1.service → /etc/systemd/system/postgresp1.service.
    postgres@pgaf1:~$ sudo systemctl start postgresp1.service
    

    Do the same on the remaining two nodes and reboot all systems. If all went fine, pg_auto_failover and the PostgreSQL clusters will come up automatically:

    postgres@pgaf1:~$ pg_autoctl show state --pgdata /u02/pgdata/13/monitor/
      Name |  Node |                      Host:Port |       LSN | Reachable |       Current State |      Assigned State
    -------+-------+--------------------------------+-----------+-----------+---------------------+--------------------
    node_1 |     1 | pgaf1.it.dbi-services.com:5432 | 0/6002320 |       yes |             primary |             primary
    node_2 |     2 | pgaf2.it.dbi-services.com:5432 | 0/6002320 |       yes |           secondary |           secondary
    node_3 |     3 | pgaf3.it.dbi-services.com:5432 | 0/6002320 |       yes |           secondary |           secondary
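
    Once the cluster is up, a controlled switchover can be requested from the monitor with pg_autoctl. Testing this is the topic of the next post, so here is only a minimal sketch (not executed in this setup; check the pg_autoctl documentation for the options available in your version):

    # ask the monitor to orchestrate a switchover: one secondary gets promoted,
    # the old primary is demoted and re-attached as a secondary
    postgres@pgaf1:~$ pg_autoctl perform switchover --pgdata /u02/pgdata/13/monitor/
    # verify the new role assignment afterwards
    postgres@pgaf1:~$ pg_autoctl show state --pgdata /u02/pgdata/13/monitor/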
    

    That’s it for the first part. In the next post we’ll look at how robust pg_auto_failover is, e.g. what happens when the first node, which also runs the monitor, goes down?

    Cet article Easy failover and switchover with pg_auto_failover est apparu en premier sur Blog dbi services.

    Recovery in the ☁ with Google Cloud SQL (PostgreSQL)


    By Franck Pachot

    In a previous post I started this series of “Recovery in the ☁” with the Oracle Autonomous database. My goal is to explain the recovery procedures, especially the Point-In-Time recovery procedures, because there is often confusion, which I tried to clarify in “What is a database backup (back to the basics)”. And the terms used in managed cloud services or documentation are not very clear, not always the same, and sometimes misleading.

    For example, the Google Cloud SQL documentation says: “Backups are lightweight; they provide a way to restore the data on your instance to its state at the time you took the backup”, and this is right (you can also restore to another instance). The same documentation mentions a bit later that “Point-in-time recovery helps you recover an instance to a specific point in time”. So all information is correct here. But the way it is put is misleading: it mentions backups (i.e. how the protection is implemented) for one and recovery (i.e. how the protection is used) for the other. In my opinion, the cloud practitioner should not be concerned by backups in a managed database. Of course, the cloud architect must know how it works. But only the recovery should be exposed to the user. Backups are what the cloud provider runs to ensure the recovery SLA. Here the term backup actually means “restore point”: the only point in time you can recover to when point-in-time recovery is not enabled. But backups are actually used for both: the point-in-time recovery option just enables additional backups (the WAL/redo).

    PostgreSQL

    I have created a PostgreSQL instance on the Google Cloud (the service “Google Cloud SQL” offers MySQL, PostgreSQL and SQLServer):

    You can see that I enabled “Automate backups” with a time window where they can occur (daily backups) by keeping the default. And “Enable point-in-time recovery”, which is not enabled by default.

    Point in Time Recovery

    I can understand the reason why it is not enabled by default: enabling it requires more storage for the backups, and it is fair not to activate a more expensive option by default. However, I think that when you choose a SQL database, you opt for persistence and durability and expect your database to be protected. I’m not talking only about daily snapshots of the database. All transactions must be protected. Any component can fail, and without point-in-time recovery a failure compromises durability.

    From my consulting experience and contributions in database forums, I know how people read this. They see “backup” enabled and then think they are protected. It is a managed service; they may not know that their transactions are not protected if they don’t enable WAL archiving. And when they discover it, it will be too late. I have seen too many databases where the recovery settings do not fit what users expect. If I were to design this GUI, with my DBA wisdom, I would either make point-in-time recovery the default, or show a red warning saying: with this default you save storage but will lose transactions if you need to recover.

    Here, I have enabled the option “Enable point-in-time recovery” which is clearly described: Allows you to recover data from a specific point in time, down to a fraction of a second, via write-ahead log archiving. Make sure your storage can support at least 7 days of logs. We will see later what happens if storage cannot support 7 days.

    I’ve created a simple table, similar to what I did on DigitalOcean to understand their recovery possibilities in this post.

    
    postgres=> create table DEMO as select current_timestamp ts;
    SELECT 1
    postgres=> select * from DEMO;
                  ts
    -------------------------------
     2020-12-09 18:08:24.818999+00
    (1 row)
    

    I have created a simple table with a timestamp

    
    while true ; do
     PGUSER=postgres PGPASSWORD="**P455w0rd**" psql -h 34.65.91.234 postgres <<<'insert into DEMO select current_timestamp;'
    sleep 15 ; done
    

    This connects and inserts one row every 15 seconds.

    
    [opc@a aws]$ PGUSER=postgres PGPASSWORD="**P455w0rd**" psql -h 34.65.91.234 postgres <<<'select max(ts) from DEMO;' | ts
    Dec 09 21:53:25               max
    Dec 09 21:53:25 -------------------------------
    Dec 09 21:53:25  2020-12-09 20:53:16.008487+00
    Dec 09 21:53:25 (1 row)
    Dec 09 21:53:25
    

    I’m interested to see the last value, especially as I’ll do a point-in-time recovery.

    
    [opc@a aws]$ PGUSER=postgres PGPASSWORD="**P455w0rd**" psql -h 34.65.91.234 postgres | ts
    insert into DEMO select current_timestamp returning *;
    Dec 09 21:55:58               ts
    Dec 09 21:55:58 -------------------------------
    Dec 09 21:55:58  2020-12-09 20:55:58.959696+00
    Dec 09 21:55:58 (1 row)
    Dec 09 21:55:58
    Dec 09 21:55:58 INSERT 0 1
    insert into DEMO select current_timestamp returning *;
    Dec 09 21:55:59               ts
    Dec 09 21:55:59 -------------------------------
    Dec 09 21:55:59  2020-12-09 20:55:59.170259+00
    Dec 09 21:55:59 (1 row)
    Dec 09 21:55:59
    Dec 09 21:55:59 INSERT 0 1
    insert into DEMO select current_timestamp returning *;
    Dec 09 21:55:59               ts
    Dec 09 21:55:59 -------------------------------
    Dec 09 21:55:59  2020-12-09 20:55:59.395784+00
    Dec 09 21:55:59 (1 row)
    Dec 09 21:55:59
    Dec 09 21:55:59 INSERT 0 1
    insert into DEMO select current_timestamp returning *;
    Dec 09 21:55:59               ts
    Dec 09 21:55:59 -------------------------------
    Dec 09 21:55:59  2020-12-09 20:55:59.572712+00
    Dec 09 21:55:59 (1 row)
    Dec 09 21:55:59
    Dec 09 21:55:59 INSERT 0 1
    

    I have inserted a few more records at a higher frequency, and this is the point I want to recover to: 2020-12-09 20:55:59, where I expect to see the previous value committed (20:55:58.959696).

    You do a Point In Time recovery with a clone. This is where naming may differ between cloud providers, and it is important to understand it. You do a Point In Time recovery when an error happened in the past: a table was dropped by mistake, the application updated the wrong data because of a user error or application bug, maybe you need to check a past version of a stored procedure… You want to recover the database to the state just before this error, but you also want to keep the modifications that happened later. And recovery is at the database level (some databases offer tablespace subdivision), so it is all or none. Then you can’t overwrite the current database. You keep it running, at its current state, and do your point-in-time recovery into another one. Actually, even with databases that offer fast point-in-time recovery (PITR), like Oracle Flashback Database or Aurora Backtrack, I did in-place PITR only for special cases: a CI test database, or production during an offline application release. But usually production databases have transactions coming in that you don’t want to lose.

    Then, with out-of-place PITR, you have access to both the current state and the previous state, and you merge what you have to merge in order to keep the current state with the errors corrected from the past state. This is a copy of the database from a previous state, and it is called a clone: it creates a new database instance that you will keep at least for the time you need to compare, analyze, export, and correct the error. So… do not search for a “recover” button. This is in the CLONE action.

    The “Create a clone” dialog has two options: “Clone current state of instance” and “Clone from an earlier point in time”. The first one is not about recovery because there’s no error to recover from, just the need to get a copy. The second one is the Point In Time recovery.

    So yes, this operation is possible because you enabled “Point in Time Recovery” and “Point in Time Recovery” (PITR) is what you want to do. But, in order to do that, you go to the “Clone” menu and you click on “Clone”. Again, it makes sense, it is a clone, but I think it can be misleading. Especially when the first time you go to this menu is when a mistake has been made and you are under stress to repair.

    When you select “Clone from an earlier point in time” you choose the point in time with a precision of one second. This is where you select the latest point just before the failure. I’ll choose 2020-12-09 20:55:59 or – as this is American, 2020-12-09 8:55:59 PM.

    While it runs (it can take time because the whole database is cloned even if you need only part of it), I’ll mention two things. The first one is that you have a granularity of 1 second in the GUI and can even go further with the CLI (see the sketch below). The second one is that you can restore to a point in time that is as recent as a few minutes before the current one. This is obvious when you work on on-premises databases because you know the WAL is there, but not all managed databases allow it. For example, in the previous post on the Oracle Autonomous Database I got a message telling me that “the timestamp specified is not at least 2 hours in the past”. Here, at Dec 9, 2020, 10:12:43 PM, I’m creating a clone of the 2020-12-09 8:55:59 PM state with no problem.
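
    For reference, the same point-in-time clone should also be possible from the command line; this is only a sketch, assuming the gcloud sql instances clone command and its point-in-time option (the instance names are made up, so check the current gcloud documentation for the exact flags):

    # hypothetical sketch: clone "pg-src" to "pg-clone" as of a given UTC timestamp
    gcloud sql instances clone pg-src pg-clone \
        --point-in-time='2020-12-09T19:55:59Z'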

    Failed

    Yes, my first PITR attempt failed. But that’s actually not bad because I’m testing the service, and that’s the occasion to see what happens and how to troubleshoot.

    One bad thing (which is unfortunately common to many managed clouds as they try to show a simple interface that hides the complexity of a database system): no clue about what happened:

    The message says “Failed to create or a fatal error during maintenance” and the SEE DETAILS has the following details: “An unknown error occured”. Not very helpful.

    But I have also 3 positive feedbacks. First, we have full access to the postgreSQL logs. There’s even a nice interface to browse them (see the screenshot) but I downloaded them as text to browse with vi 😉

    From here I see no problem at all. Just a normal point-in-time recovery:

    
    ,INFO,"2020-12-09 21:59:18.236 UTC [1]: [10-1] db=,user= LOG:  aborting any active transactions",2020-12-09T21:59:18.237683Z
    ,INFO,"2020-12-09 21:59:18.232 UTC [1]: [9-1] db=,user= LOG:  received fast shutdown request",2020-12-09T21:59:18.235100Z
    ,INFO,"2020-12-09 21:59:17.457 UTC [1]: [8-1] db=,user= LOG:  received SIGHUP, reloading configuration files",2020-12-09T21:59:17.457731Z
    ,INFO,"2020-12-09 21:59:12.562 UTC [1]: [7-1] db=,user= LOG:  received SIGHUP, reloading configuration files",2020-12-09T21:59:12.567686Z
    ,INFO,"2020-12-09 21:59:11.436 UTC [1]: [6-1] db=,user= LOG:  database system is ready to accept connections",2020-12-09T21:59:11.437753Z
    ,INFO,"2020-12-09 21:59:11.268 UTC [11]: [11-1] db=,user= LOG:  archive recovery complete",2020-12-09T21:59:11.268715Z
    ,INFO,"2020-12-09 21:59:11.105 UTC [11]: [10-1] db=,user= LOG:  selected new timeline ID: 2",2020-12-09T21:59:11.106147Z
    ,INFO,"2020-12-09 21:59:11.040 UTC [11]: [9-1] db=,user= LOG:  last completed transaction was at log time 2020-12-09 19:55:58.056897+00",2020-12-09T21:59:11.041372Z
    ,INFO,"2020-12-09 21:59:11.040 UTC [11]: [8-1] db=,user= LOG:  redo done at 0/123F71D0",2020-12-09T21:59:11.041240Z
    ,INFO,"2020-12-09 21:59:11.040 UTC [11]: [7-1] db=,user= LOG:  recovery stopping before commit of transaction 122997, time 2020-12-09 19:56:03.057621+00",2020-12-09T21:59:11.040940Z
    ,INFO,"2020-12-09 21:59:10.994 UTC [11]: [6-1] db=,user= LOG:  restored log file ""000000010000000000000012"" from archive",2020-12-09T21:59:10.996445Z
    ,INFO,"2020-12-09 21:59:10.900 UTC [1]: [5-1] db=,user= LOG:  database system is ready to accept read only connections",2020-12-09T21:59:10.900859Z
    ,INFO,"2020-12-09 21:59:10.899 UTC [11]: [5-1] db=,user= LOG:  consistent recovery state reached at 0/11000288",2020-12-09T21:59:10.899960Z
    ,ALERT,"2020-12-09 21:59:10.896 UTC [32]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.896214Z
    ,INFO,"2020-12-09 21:59:10.894 UTC [11]: [4-1] db=,user= LOG:  redo starts at 0/11000028",2020-12-09T21:59:10.894908Z
    ,INFO,"2020-12-09 21:59:10.852 UTC [11]: [3-1] db=,user= LOG:  restored log file ""000000010000000000000011"" from archive",2020-12-09T21:59:10.852640Z
    ,INFO,"2020-12-09 21:59:10.751 UTC [11]: [2-1] db=,user= LOG:  starting point-in-time recovery to 2020-12-09 19:55:59+00",2020-12-09T21:59:10.764881Z
    ,ALERT,"2020-12-09 21:59:10.575 UTC [21]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.576173Z
    ,ALERT,"2020-12-09 21:59:10.570 UTC [20]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.571169Z
    ,ALERT,"2020-12-09 21:59:10.566 UTC [19]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.567159Z
    ,ALERT,"2020-12-09 21:59:10.563 UTC [18]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.563188Z
    ,ALERT,"2020-12-09 21:59:10.560 UTC [17]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.560293Z
    ,ALERT,"2020-12-09 21:59:10.540 UTC [16]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.540919Z
    ,ALERT,"2020-12-09 21:59:10.526 UTC [14]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.526218Z
    ,ALERT,"2020-12-09 21:59:10.524 UTC [15]: [1-1] db=cloudsqladmin,user=cloudsqladmin FATAL:  the database system is starting up",2020-12-09T21:59:10.524291Z
    ,INFO,"2020-12-09 21:59:10.311 UTC [11]: [1-1] db=,user= LOG:  database system was interrupted; last known up at 2020-12-08 23:29:48 UTC",2020-12-09T21:59:10.311491Z
    ,INFO,"2020-12-09 21:59:10.299 UTC [1]: [4-1] db=,user= LOG:  listening on Unix socket ""/pgsql/.s.PGSQL.5432""",2020-12-09T21:59:10.299742Z
    ,INFO,"2020-12-09 21:59:10.291 UTC [1]: [3-1] db=,user= LOG:  listening on IPv6 address ""::"", port 5432",2020-12-09T21:59:10.291347Z
    ,INFO,"2020-12-09 21:59:10.290 UTC [1]: [2-1] db=,user= LOG:  listening on IPv4 address ""0.0.0.0"", port 5432",2020-12-09T21:59:10.290905Z
    ,INFO,"2020-12-09 21:59:10.288 UTC [1]: [1-1] db=,user= LOG:  starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by Debian clang version 10.0.1 , 64-bit",2020-12-09T21:59:10.289086Z
    

    The last transaction recovered was at 2020-12-09 19:55:58.056897+00, and this is exactly what I expected for a recovery target of 19:55:59 (yes, I wanted to put 20:55:59 in order to see the transaction from one second before, but looking at the screenshot I forgot that I was in UTC+1 there 🤷‍♂️)

    While watching the logs I see many messages like ERROR: relation “pg_stat_statements” does not exist.
    It seems they use PMM, from Percona, for monitoring. I ran CREATE EXTENSION PG_STAT_STATEMENTS; to avoid filling the logs.

    So, the first thing that is awesome: the recovery happens exactly as expected and we can see the full log. My unknown fatal problem happened later. But there’s another very positive point: I’m running with trial credits but tried to find some support. And someone from the billing support (not really their job) tried to help me. It was not really helpful in this case, but it is always nice to find someone who tries to help and transparently tells you that he is trying but does not have all the tech support access to go further. Thanks Dan.

    And I mentioned a third thing that is positive. Knowing that this unexpected error happened after the recovery, I just tried again while Dan was checking whether more information was available. And it worked (so I didn’t disturb the billing support anymore). So I was probably just unlucky.

    Second try

    The second try was successful. Here is the log of operations (I started the clone at 22:24 – I mean 10:24 PM, GMT+1, so actually 21:24 UTC…):

    
    Dec 9, 2020, 11:00:30 PM	Backup	Backup finished
    Dec 9, 2020, 10:54:22 PM	Clone	Clone finished
    

    Great, a backup was initiated just after the clone. My clone is protected (and point-in-time recovery is enabled by default here, like in the source).

    Let’s check the log:

    ,INFO,"2020-12-09 21:59:10.751 UTC [11]: [2-1] db=,user= LOG:  starting point-in-time recovery to 2020-12-09 19:55:59+00",2020-12-09T21:59:10.764881Z

    Yes, again I had not yet realized that I entered the time in GMT+1, but no worry, I trust the PostgreSQL logs.

    I quickly check the last record in my table in the clone:

    
    Your Cloud Platform project in this session is set to disco-abacus-161115.
    Use “gcloud config set project [PROJECT_ID]” to change to a different project.
    
    franck@cloudshell:~ (disco-abacus-161115)$ PGUSER=postgres PGPASSWORD="**P455w0rd**" psql -h 34.65.191.96   postgres
    
    psql (13.1 (Debian 13.1-1.pgdg100+1), server 13.0)
    SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
    Type "help" for help.
    
    postgres=> select max(ts) from demo;
    
                  max
    -------------------------------
     2020-12-09 19:55:51.229302+00
    (1 row)
    
    postgres=>
    

    19:55:51 is ok for a recovery at 19:55:59 as I insert every 15 seconds – this was my last transaction at this point in time. PITR is ok.

    Disabling PITR

    In order to test the recovery without point-in-time recovery enabled, I disabled it. This requires a database restart, but I have not seen any warning, so be careful when you change something.

    I check the log to see the restart, and actually I see two of them:

    And this may be the reason:

    
    LOG:  parameter "archive_mode" cannot be changed without restarting the server
    

    Yes, that’s the PostgreSQL message, but… there’s more:

    
    LOG:  configuration file "/pgsql/data/postgresql.conf" contains errors; unaffected changes were applied
    

    Ok… this explains why there was another restart: to remove the wrong settings?

    No, apparently, “Point-in-time recovery” is Disabled from the console and in the engine as well:

    
    [opc@a gcp]$ PGUSER=postgres PGPASSWORD="**P455w0rd**" psql -h 34.65.191.96   postgres
    psql (12.4, server 13.0)
    WARNING: psql major version 12, server major version 13.
             Some psql features might not work.
    SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES128-GCM-SHA256, bits: 128, compression: off)
    Type "help" for help.
    
    postgres=> show archive_mode;
     archive_mode
    --------------
     off
    (1 row)
    
    postgres=> show archive_command;
     archive_command
    -----------------
     (disabled)
    

    so all good finally.

    Yes, enabling PITR is actually setting archive_mode and archive_command (if you don’t already know postgresqlco.nf I suggest you follow the links)
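
    For comparison, on a self-managed PostgreSQL the same protection comes from settings like these in postgresql.conf (just a sketch with a made-up archive destination, not what Cloud SQL uses):

    # postgresql.conf -- self-managed example, not the Cloud SQL configuration
    archive_mode = on
    archive_command = 'test ! -f /backup/wal_archive/%f && cp %p /backup/wal_archive/%f'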

    Recovery without point-in-time

    Now that PITR is disabled, the “Clone from an earlier point in time” option is disabled as well, which is very good because it does not mislead you:

    You have backups but cannot use them to clone. I like that the GUI makes it very clear: when you restore a backup, you do it either in-place or to another instance that you have created before. We are not in clone creation here; we erase an existing database. And there are many warnings and confirmations: no risk.

    
    [opc@a gcp]$ PGUSER=postgres PGPASSWORD="**P455w0rd**" psql -h 34.65.191.96   postgres
    psql (12.4, server 13.0)
    WARNING: psql major version 12, server major version 13.
             Some psql features might not work.
    SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES128-GCM-SHA256, bits: 128, compression: off)
    Type "help" for help.
    
    postgres=> select max(ts) from DEMO;
                  max
    -------------------------------
     2020-12-09 23:29:45.363173+00
    (1 row)
    
    

    I selected the backup from 12:29:40 in my GMT+1 timezone, and here is my database state as of 23:29:45 UTC, the time when the backup finished. All perfect.

    About PITR and WAL size…

    I mentioned earlier that enabling Point In Time recovery uses more storage for the WAL. By default the storage for the database auto-increases, so the risk is only to pay more than expected. It is then better to monitor it. For this test, I disabled “Auto storage increase”, which is displayed with a warning for a good reason. PostgreSQL does not like a full filesystem, and here I’ll show the consequence.

    
    postgres=> show archive_mode;
    
     archive_mode
    --------------
     on
    (1 row)
    
    postgres=> show archive_command;
                                                 archive_command
    ---------------------------------------------------------------------------------------------------------
     /utils/replication_log_processor -disable_log_to_disk -action=archive -file_name=%f -local_file_path=%p
    (1 row)
    
    

    I’m checking, from the database, that WAL archiving is on. I have inserted a few million rows into my demo table and will run an update to generate a lot of WAL:

    
    explain (analyze, wal) update DEMO set ts=current_timestamp;
                                                              QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------------------
     Update on demo  (cost=0.00..240401.80 rows=11259904 width=14) (actual time=199387.841..199387.842 rows=0 loops=1)
       WAL: records=22519642 fpi=99669 bytes=1985696687
       ->  Seq Scan on demo  (cost=0.00..240401.80 rows=11259904 width=14) (actual time=1111.600..8377.371 rows=11259904 loops=1)
     Planning Time: 0.216 ms
     Execution Time: 199389.368 ms
    (5 rows)
    
    vacuum DEMO;
    VACUUM
    

    With PostgreSQL 13 it is easy to measure the amount of WAL generated to protect the changes: 2 GB here, so my 15GB storage will quickly be full.
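
    If you want to measure WAL generation for an arbitrary workload yourself, the standard WAL functions are enough; a small psql sketch (nothing Cloud SQL specific):

    -- remember the current WAL position before running the workload
    select pg_current_wal_lsn() as before_lsn \gset
    -- ... run the workload ...
    -- then compute how much WAL has been generated since
    select pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), :'before_lsn')) as wal_generated;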

    When the storage reached 15GB my query failed with:

    
    WARNING:  terminating connection because of crash of another server process
    DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited
     abnormally and possibly corrupted shared memory.
    HINT:  In a moment you should be able to reconnect to the database and repeat your command.
    SSL SYSCALL error: EOF detected
    connection to server was lost
    psql: error: could not connect to server: FATAL:  the database system is in recovery mode
    FATAL:  the database system is in recovery mode
    

    I'm used to Oracle, where the database hangs in that case (if it can't protect the changes by generating redo, it cannot accept new changes). But with PostgreSQL the instance crashes when there is no space left in the filesystem.

    And here, the problem is that, after a while, I cannot change anything, like increasing the storage. The instance is in a failure state ("Failed to create or a fatal error occurred during maintenance") from the cloud point of view. I can't even clone the database to another one. I can delete some backups to reclaim space, but I tried too late, when the instance was already out of service (I ran the same scenario on another identical instance and was able to restart it when reclaiming space quickly enough). I think the only thing that I can do by myself (without cloud ops intervention) is restore the last backup. Fortunately, I've created a few manual backups, as I wanted to see whether they shorten the recovery window. I've read that only 7 backups are kept, but those are the daily automatic ones, so the recovery window is 7 days (by default; you can bring it up to 365). You create manual backups when you don't have PITR and need a restore point (before an application release or a risky maintenance, for example), or even with PITR enabled, when you want to reduce the recovery time.
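    Such a manual backup can also be created from the command line. A minimal sketch with gcloud (the instance name is a placeholder):

    gcloud sql backups create --instance=demo-instance --description="before risky maintenance"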

    I cannot restore in place, getting the following message: "You can't restore an instance from a backup if it has replicas. To resolve, you can delete the replicas." Anyway, I would never recommend restoring in-place, even when you think you cannot do anything else. You never know. Here I am sure that the database is recoverable without data loss: I have backups, I have WAL, and they were fsync'd at commit. Actually, after deleting some backups to reclaim space, what I see in the postgres log looks good. So if this happens to you, contact the support immediately, and I guess the cloud ops can check the state and bring it back to operational.

    So always keep the failed instance, just in case the support can get your data back. And we are in the cloud: provisioning a new instance for a few days is not a problem. I have created a new instance and restored the backup from Dec 10, 2020, 6:54:07 PM to it. I must say that at that point I had no idea at which state it would be restored. On one hand I'm in the RESTORE BACKUP action, not point-in-time recovery. But I know that WAL is available up to the point of failure because PITR was enabled. It is always very important to rehearse the recovery scenarios, and it is even more critical in a managed cloud, because what you know is possible technically may not be possible through the service.

    
    franck@cloudshell:~ (disco-abacus-161115)$ PGUSER=postgres PGPASSWORD="**P455w0rd**" psql -h 34.65.38.32
    psql (13.1 (Debian 13.1-1.pgdg100+1), server 13.0)
    SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
    Type "help" for help.
    
    postgres=> select max(ts) from demo;
                  max
    -------------------------------
     2020-12-10 17:54:22.874403+00
    (1 row)
    
    postgres=>
    

    This is the backup time, so no recovery happened. Even if the WAL are there, they are not applied, and this is confirmed by the PostgreSQL log, which shows no point-in-time recovery.

    As you can see, I'm starting to become a master at querying the Google Cloud logs and didn't export them to a file 😉

    So, because no WAL was applied, I think that backups are taken as consistent filesystem snapshots.

    Summary

    Here is my takeaway from those tests. I really like how the recovery possibilities are presented, even if I would prefer "backup" to be named "restore point" to avoid any confusion. But it is really good to differentiate the restore of a specific state from the possibility to clone from any point in time. I also like that the logical export/import (pg_dump) is in a different place than backup/recovery/clone, because a dump is not a database backup. I like the simplicity of the interface, and the visibility of the log. Google Cloud is a really good platform for a managed PostgreSQL. And no surprise about the recovery window: when you enable point-in-time recovery, you can recover to any time, from many days ago (you configure it for your RPO requirement and the storage cost consequence) to the past second. But be careful with storage: don't let it get full or it can be fatal. I think that auto-extensible storage is good, with thresholds and alerts of course, to stay in control.

    What I would see as nice improvements would be a higher advocacy for point-in-time recovery, a big warning when a change requires a restart of the instance, better messages when something fails besides PostgreSQL, and a no-data-loss possibility to clone the current state even when the instance is broken. But as always, if you practice the recovery scenarios in advance you will be well prepared when you need them in a critical and stressful situation. And remember I did this without contacting the database support, and I'm convinced, given what I see in the logs, that they could recover my database without data loss. In a managed cloud, like on-premises, contact your DBA rather than guessing and trying things that may break it further. I was only testing what is available from the console here.

    Note that a backup RESTORE keeps the configuration of the destination instance (like PITR, firewall rules,…) but a clone has the same configuration as the source. This may not be what you want, so change it after a clone (maybe PITR is not needed for a test database, and maybe you want to allow different CIDRs to connect).

    All this may be different in your context, and in future versions, so the main message of this post is that you should spend some time understanding and testing recovery, even with a managed service.

    Cet article Recovery in the ☁ with Google Cloud SQL (PostgreSQL) est apparu en premier sur Blog dbi services.


    pg_auto_failover: Failover and switchover scenarios

    $
    0
    0

    In the last post we had a look at the installation and setup of pg_auto_failover. We currently have one primary cluster and two replicas synchronizing from this primary cluster. But we potentially also have an issue in the setup: the monitor is running beside the primary instance on the same node, and if that node goes down the monitor is gone. What happens in that case and how can we avoid it? We also did not look at controlled switchovers, and this is definitely something you want to have in production. From time to time you'll need to do some maintenance on one of the nodes, and switching the primary cluster to another node is very handy in such situations. Let's start with the simple case and have a look at switchovers first.

    This is the current state of the setup:

    postgres@pgaf1:~$ pg_autoctl show state --pgdata /u02/pgdata/13/monitor/
      Name |  Node |                      Host:Port |       LSN | Reachable |       Current State |      Assigned State
    -------+-------+--------------------------------+-----------+-----------+---------------------+--------------------
    node_1 |     1 | pgaf1.it.dbi-services.com:5432 | 0/6002408 |       yes |             primary |             primary
    node_2 |     2 | pgaf2.it.dbi-services.com:5432 | 0/6002408 |       yes |           secondary |           secondary
    node_3 |     3 | pgaf3.it.dbi-services.com:5432 | 0/6002408 |       yes |           secondary |           secondary
    

    Before we attempt to do a switch-over you should be aware of your replication settings:

    postgres@pgaf1:~$ pg_autoctl get formation settings --pgdata /u02/pgdata/13/monitor/
      Context |    Name |                   Setting | Value                                                       
    ----------+---------+---------------------------+-------------------------------------------------------------
    formation | default |      number_sync_standbys | 1                                                           
      primary |  node_1 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'
         node |  node_1 |        candidate priority | 50                                                          
         node |  node_2 |        candidate priority | 50                                                          
         node |  node_3 |        candidate priority | 50                                                          
         node |  node_1 |        replication quorum | true                                                        
         node |  node_2 |        replication quorum | true                                                        
         node |  node_3 |        replication quorum | true                                     
    

    What does this tell us:

    • synchronous_standby_names: We're using synchronous replication and at least one of the two replicas needs to confirm a commit (this is a PostgreSQL setting)
    • number_sync_standbys=1: At least one standby needs to confirm the commit (this is a pg_auto_failover setting)
    • candidate priority=50: This specifies which replica gets promoted. With the default setting of 50 all replicas have the same chance to be selected for promotion and the monitor will pick the one with the most advanced LSN (this is a pg_auto_failover setting)
    • replication quorum=true: This means synchronous replication; a value of false means asynchronous replication (this is a pg_auto_failover setting)

    You may have noticed the "formation" keyword above. A formation is a set of PostgreSQL clusters that are managed together, which means you can use the same monitor to manage multiple sets of PostgreSQL clusters. We are using the default formation in this example.
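    If the defaults do not fit your needs, these settings can be changed with pg_autoctl. As a sketch, to be verified against your pg_auto_failover version (run on the node you want to change, with its data directory): setting the candidate priority to 0 excludes the node from promotion, and setting the replication quorum to false makes its replication asynchronous.

    postgres@pgaf3:~$ pg_autoctl set node candidate-priority 0 --pgdata /u02/pgdata/13/PG1/
    postgres@pgaf3:~$ pg_autoctl set node replication-quorum false --pgdata /u02/pgdata/13/PG1/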

    Let's assume we need to do some maintenance on our primary node and therefore want to switch over the primary instance to another node. The command to do that is simple:

    postgres@pgaf1:~$ pg_autoctl perform switchover --pgdata /u02/pgdata/13/PG1/
    16:10:05 15960 INFO  Targetting group 0 in formation "default"
    16:10:05 15960 INFO  Listening monitor notifications about state changes in formation "default" and group 0
    16:10:05 15960 INFO  Following table displays times when notifications are received
        Time |   Name |  Node |                      Host:Port |       Current State |      Assigned State
    ---------+--------+-------+--------------------------------+---------------------+--------------------
    16:10:05 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |             primary |            draining
    16:10:05 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |            draining |            draining
    16:10:05 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |           secondary |          report_lsn
    16:10:05 | node_3 |     3 | pgaf3.it.dbi-services.com:5432 |           secondary |          report_lsn
    16:10:06 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |          report_lsn |          report_lsn
    16:10:06 | node_3 |     3 | pgaf3.it.dbi-services.com:5432 |          report_lsn |          report_lsn
    16:10:06 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |          report_lsn |   prepare_promotion
    16:10:06 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |   prepare_promotion |   prepare_promotion
    16:10:06 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |   prepare_promotion |    stop_replication
    16:10:06 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |            draining |      demote_timeout
    16:10:06 | node_3 |     3 | pgaf3.it.dbi-services.com:5432 |          report_lsn |      join_secondary
    16:10:06 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |      demote_timeout |      demote_timeout
    16:10:06 | node_3 |     3 | pgaf3.it.dbi-services.com:5432 |      join_secondary |      join_secondary
    16:10:07 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |    stop_replication |    stop_replication
    16:10:07 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |    stop_replication |        wait_primary
    16:10:07 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |      demote_timeout |             demoted
    16:10:07 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |             demoted |             demoted
    16:10:07 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |        wait_primary |        wait_primary
    16:10:07 | node_3 |     3 | pgaf3.it.dbi-services.com:5432 |      join_secondary |           secondary
    16:10:07 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |        wait_primary |             primary
    16:10:07 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |             demoted |          catchingup
    16:10:07 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |        wait_primary |        join_primary
    16:10:07 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |        join_primary |        join_primary
    16:10:08 | node_3 |     3 | pgaf3.it.dbi-services.com:5432 |           secondary |           secondary
    16:10:08 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |          catchingup |          catchingup
    16:10:08 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |          catchingup |           secondary
    16:10:08 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |        join_primary |             primary
    16:10:08 | node_1 |     1 | pgaf1.it.dbi-services.com:5432 |           secondary |           secondary
    16:10:08 | node_2 |     2 | pgaf2.it.dbi-services.com:5432 |             primary |             primary
    postgres@pgaf1:~$ 
    

    You'll get the progress messages on the screen, so you can actually see what happens. As the services are started with systemd you can also have a look at the journal:

    -- Logs begin at Thu 2020-12-10 15:17:38 CET, end at Thu 2020-12-10 16:11:26 CET. --
    Dec 10 16:10:08 pgaf1 pg_autoctl[327]: 16:10:08 399 INFO  FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm
    Dec 10 16:10:08 pgaf1 pg_autoctl[327]: 16:10:08 399 INFO  Transition complete: current state is now "secondary"
    Dec 10 16:10:08 pgaf1 pg_autoctl[341]: 16:10:08 397 INFO  node 1 "node_1" (pgaf1.it.dbi-services.com:5432) reported new state "secondary"
    Dec 10 16:10:08 pgaf1 pg_autoctl[341]: 16:10:08 397 INFO  New state for node 1 "node_1" (pgaf1.it.dbi-services.com:5432): secondary ➜ sec
    Dec 10 16:10:08 pgaf1 pg_autoctl[327]: 16:10:08 399 INFO  New state for this node (node 1, "node_1") (pgaf1.it.dbi-services.com:5432): se
    Dec 10 16:10:08 pgaf1 pg_autoctl[341]: 16:10:08 397 INFO  node 2 "node_2" (pgaf2.it.dbi-services.com:5432) reported new state "primary"
    Dec 10 16:10:08 pgaf1 pg_autoctl[341]: 16:10:08 397 INFO  New state for node 2 "node_2" (pgaf2.it.dbi-services.com:5432): primary ➜ prima
    Dec 10 16:10:08 pgaf1 pg_autoctl[327]: 16:10:08 399 INFO  New state for node 2 "node_2" (pgaf2.it.dbi-services.com:5432): primary ➜ prima
    

    The second node was selected as the new primary, and we can of course confirm that:

    postgres@pgaf1:~$ pg_autoctl show state --pgdata /u02/pgdata/13/monitor/
      Name |  Node |                      Host:Port |       LSN | Reachable |       Current State |      Assigned State
    -------+-------+--------------------------------+-----------+-----------+---------------------+--------------------
    node_1 |     1 | pgaf1.it.dbi-services.com:5432 | 0/60026F8 |       yes |           secondary |           secondary
    node_2 |     2 | pgaf2.it.dbi-services.com:5432 | 0/60026F8 |       yes |             primary |             primary
    node_3 |     3 | pgaf3.it.dbi-services.com:5432 | 0/60026F8 |       yes |           secondary |           secondary
    
    postgres@pgaf1:~$ 
    

    Next test: what happens when we reboot a node that is currently running a replica? Let's reboot pgaf3, as this one is currently a replica and it does not run the monitor:

    postgres@pgaf3:~$ sudo reboot
    postgres@pgaf3:~$ Connection to 192.168.22.192 closed by remote host.
    Connection to 192.168.22.192 closed.
    

    Watching the state, the "Reachable" status changes to "no" for the third instance and its LSN falls behind:

    postgres@pgaf1:~$ pg_autoctl show state --pgdata /u02/pgdata/13/monitor/
      Name |  Node |                      Host:Port |       LSN | Reachable |       Current State |      Assigned State
    -------+-------+--------------------------------+-----------+-----------+---------------------+--------------------
    node_1 |     1 | pgaf1.it.dbi-services.com:5432 | 0/60026F8 |       yes |           secondary |           secondary
    node_2 |     2 | pgaf2.it.dbi-services.com:5432 | 0/60026F8 |       yes |             primary |             primary
    node_3 |     3 | pgaf3.it.dbi-services.com:5432 | 0/6000000 |        no |           secondary |           secondary
    

    Once it is back, the replica is brought back into the configuration and all is fine:

    postgres@pgaf1:~$ pg_autoctl show state --pgdata /u02/pgdata/13/monitor/
      Name |  Node |                      Host:Port |       LSN | Reachable |       Current State |      Assigned State
    -------+-------+--------------------------------+-----------+-----------+---------------------+--------------------
    node_1 |     1 | pgaf1.it.dbi-services.com:5432 | 0/60026F8 |       yes |           secondary |           secondary
    node_2 |     2 | pgaf2.it.dbi-services.com:5432 | 0/60026F8 |       yes |             primary |             primary
    node_3 |     3 | pgaf3.it.dbi-services.com:5432 | 0/6000000 |       yes |           secondary |           secondary
    
    ...
    postgres@pgaf1:~$ pg_autoctl show state --pgdata /u02/pgdata/13/monitor/
      Name |  Node |                      Host:Port |       LSN | Reachable |       Current State |      Assigned State
    -------+-------+--------------------------------+-----------+-----------+---------------------+--------------------
    node_1 |     1 | pgaf1.it.dbi-services.com:5432 | 0/6013120 |       yes |           secondary |           secondary
    node_2 |     2 | pgaf2.it.dbi-services.com:5432 | 0/6013120 |       yes |             primary |             primary
    node_3 |     3 | pgaf3.it.dbi-services.com:5432 | 0/6013120 |       yes |           secondary |           secondary
    

    But what happens if we shut down the monitor node?

    postgres@pgaf1:~$ sudo systemctl poweroff
    postgres@pgaf1:~$ Connection to 192.168.22.190 closed by remote host.
    Connection to 192.168.22.190 closed.
    

    Checking the status on the node which currently hosts the primary cluster:

    postgres@pgaf2:~$ pg_autoctl show state --pgdata /u02/pgdata/13/PG1/
    10:26:52 1293 WARN  Failed to connect to "postgres://autoctl_node@pgaf1.it.dbi-services.com:5433/pg_auto_failover?sslmode=require", retrying until the server is ready
    10:26:52 1293 ERROR Connection to database failed: timeout expired
    10:26:52 1293 ERROR Failed to connect to "postgres://autoctl_node@pgaf1.it.dbi-services.com:5433/pg_auto_failover?sslmode=require" after 1 attempts in 2 seconds, pg_autoctl stops retrying now
    10:26:52 1293 ERROR Failed to retrieve current state from the monitor
    

    As the monitor is down we cannot ask for the status anymore. The primary and the remaining replica cluster are still up and running, but we lost the possibility to interact with pg_auto_failover. Booting up the monitor node brings it back into the game:

    postgres@pgaf2:~$ pg_autoctl show state --pgdata /u02/pgdata/13/PG1/
      Name |  Node |                      Host:Port |       LSN | Reachable |       Current State |      Assigned State
    -------+-------+--------------------------------+-----------+-----------+---------------------+--------------------
    node_1 |     1 | pgaf1.it.dbi-services.com:5432 | 0/6000000 |       yes |           secondary |           secondary
    node_2 |     2 | pgaf2.it.dbi-services.com:5432 | 0/6013240 |       yes |             primary |             primary
    node_3 |     3 | pgaf3.it.dbi-services.com:5432 | 0/6013240 |       yes |           secondary |           secondary
    

    This has a consequence: the monitor should not run on any of the PostgreSQL nodes but on a separate node which is dedicated to the monitor. As you can manage more than one HA setup with the same monitor, this should not be an issue, though. But this also means that the monitor is a single point of failure and the health of the monitor is critical for pg_auto_failover.

    Cet article pg_auto_failover: Failover and switchover scenarios est apparu en premier sur Blog dbi services.

    Oracle 21c Security : Mandatory Profile

    $
    0
    0

    With Oracle 21c, it is now possible to enforce a password policy (length, number of digits…) for all pluggable databases or for specific pluggable databases via profiles. This is done by creating a mandatory profile in the root CDB, and this profile is then attached to the corresponding PDBs.
    The mandatory profile is a generic profile that can only have a single parameter, the PASSWORD_VERIFY_FUNCTION.
    The password complexity verification function of the mandatory profile is checked before the password complexity function that is associated with the user account profile.
    For example, the password length defined in the mandatory profile will take precedence over any other password length defined in any other profile associated with the user.
    When defined, the limits of the mandatory profile are enforced in addition to the limits of the actual profile of the user.
    A mandatory profile cannot be assigned to a user; it is attached to a PDB.

    In this demonstration we will consider an instance DB21 with 3 PDBs:
    -PDB1
    -PDB2
    -PDB3

    We will create 2 mandatory profiles:
    c##mand_profile_pdb1_pdb2 which will be assigned to PDB1 and PDB2
    c##mand_profile_pdb3 which will be assigned to PDB3

    SQL> show pdbs;
    
        CON_ID CON_NAME                       OPEN MODE  RESTRICTED
    ---------- ------------------------------ ---------- ----------
             2 PDB$SEED                       READ ONLY  NO
             3 PDB1                           READ WRITE NO
             4 PDB2                           READ WRITE NO
             5 PDB3                           READ WRITE NO
    SQL>
    

    We will create two verification functions in the root container that we will associate with our mandatory profiles. The first function checks for a minimum password length of 6 characters:

    SQL> CREATE OR REPLACE FUNCTION func_pdb1_2_verify_function
     ( username     varchar2,
       password     varchar2,
       old_password varchar2)
     return boolean IS
    BEGIN
       if not ora_complexity_check(password, chars => 6) then
          return(false);
       end if;
       return(true);
    END;
    /  
    
    Function created.
    
    SQL>
    

    The second function checks for a minimum password length of 10 characters:

    SQL> CREATE OR REPLACE FUNCTION func_pdb3_verify_function
     ( username     varchar2,
       password     varchar2,
       old_password varchar2)
     return boolean IS
    BEGIN
       if not ora_complexity_check(password, chars => 10) then
          return(false);
          end if;
       return(true);
    END;
    / 
    
    Function created.
    
    SQL>
    

    Now let’s create the two mandatory profiles in the root container

    SQL>
    CREATE MANDATORY PROFILE c##mand_profile_pdb1_pdb2
    LIMIT PASSWORD_VERIFY_FUNCTION func_pdb1_2_verify_function
    CONTAINER = ALL;
    
    Profile created.
    

    Now let's create the second mandatory profile, which will be assigned to PDB3:

    SQL> CREATE MANDATORY PROFILE c##mand_profile_pdb3
    LIMIT PASSWORD_VERIFY_FUNCTION func_pdb3_verify_function
    CONTAINER = ALL;  
    
    Profile created.
    

    Remember that we want to associate the mandatory profile c##mand_profile_pdb1_pdb2 with PDB1 and PDB2. So we first attach this profile to all PDBs from the root container:

    SQL> show con_name;
    
    CON_NAME
    ------------------------------
    CDB$ROOT
    
    SQL> alter system set mandatory_user_profile=c##mand_profile_pdb1_pdb2;
    
    System altered.
    
    SQL>
    

    To associate the profile c##mand_profile_pdb3 with PDB3, we set the MANDATORY_USER_PROFILE parameter from within PDB3:

    SQL> show con_name;
    
    CON_NAME
    ------------------------------
    PDB3
    SQL>  alter system set mandatory_user_profile=c##mand_profile_pdb3;
    
    System altered.
    
    SQL>
    

    We can then verify the different values of the parameter MANDATORY_USER_PROFILE in the different PDBs

    SQL> show con_name;
    
    CON_NAME
    ------------------------------
    PDB3
    
    SQL> show parameter mandatory;
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    mandatory_user_profile               string      C##MAND_PROFILE_PDB3
    SQL> alter session set container=PDB1;
    
    Session altered.
    
    SQL> show parameter mandatory;
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    mandatory_user_profile               string      C##MAND_PROFILE_PDB1_PDB2
    SQL>  alter session set container=PDB2;
    
    Session altered.
    
    SQL> show parameter mandatory;
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    mandatory_user_profile               string      C##MAND_PROFILE_PDB1_PDB2
    SQL>
    

    To test, we will try to create a user in PDB3 with a password length < 10:

    SQL> create user toto identified by "DGDTr##5";
    create user toto identified by "DGDTr##5"
    *
    ERROR at line 1:
    ORA-28219: password verification failed for mandatory profile
    ORA-20000: password length less than 10 characters
    
    
    SQL>
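
    With a password of at least 10 characters, the creation should then succeed (expected behavior, sketched here rather than actually run):

    SQL> create user toto identified by "DGDTr##5candidate";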
    

    Cet article Oracle 21c Security : Mandatory Profile est apparu en premier sur Blog dbi services.

    Oracle write consistency bug and multi-thread de-queuing

    $
    0
    0

    By Franck Pachot

    .
    This was initially posted on CERN Database blog where it seems to be lost. Here is a copy thanks to web.archive.org
    Additional notes:
    – I’ve tested and got the same behaviour in Oracle 21c
    – you will probably enjoy reading Hatem Mahmoud going further on Write consistency and DML restart

    Posted by Franck Pachot on Thursday, 27 September 2018

    Here is a quick test I did after encountering an abnormal behavior in write consistency and before finding some references to a bug on StackOverflow (yes, write consistency questions on StackOverflow!) and AskTOM. And a bug opened by Tom Kyte in 2011, that is still there in 18c.

    The original issue was with a task management system running jobs. Here is the simple table where all rows have a 'NEW' status, and the goal is to have several threads processing them by updating them to the 'HOLDING' status and adding the process name.

    
    set echo on
    drop table DEMO;
    create table DEMO (ID primary key,STATUS,NAME,CREATED)
     as select rownum,cast('NEW' as varchar2(10)),cast(null as varchar2(10)),sysdate+rownum/24/60 from xmltable('1 to 10')
    /
    

    Now here is the query that selects the 5 oldest rows in status ‘NEW’ and updates them to the ‘HOLDING’ status:

    
    UPDATE DEMO SET NAME = 'NUMBER1', STATUS = 'HOLDING' 
    WHERE ID IN (
     SELECT ID FROM (
      SELECT ID, rownum as counter 
      FROM DEMO 
      WHERE STATUS = 'NEW' 
      ORDER BY CREATED
     ) 
    WHERE counter <= 5) 
    ;
    

    Note that the update also sets the name of the session which has processed the rows, here ‘NUMBER1’.

    Once the query started, and before the commit, I’ve run the same query from another session, but with ‘NUMBER2’.

    
    UPDATE DEMO SET NAME = 'NUMBER2', STATUS = 'HOLDING' 
    WHERE ID IN (
     SELECT ID FROM (
      SELECT ID, rownum as counter 
      FROM DEMO 
      WHERE STATUS = 'NEW' 
      ORDER BY CREATED
     ) 
    WHERE counter <= 5) 
    ;

    Of course, this waits on a row lock from the first session as it has selected the same rows. Then I commit the first session and check, from the first session, what has been updated:

    
    commit;
    set pagesize 1000
    select versions_operation,versions_xid,DEMO.* from DEMO versions between scn minvalue and maxvalue order by ID,2;
    
    V VERSIONS_XID             ID STATUS     NAME       CREATED        
    - ---------------- ---------- ---------- ---------- ---------------
    U 0500110041040000          1 HOLDING    NUMBER1    27-SEP-18 16:48
                                1 NEW                   27-SEP-18 16:48
    U 0500110041040000          2 HOLDING    NUMBER1    27-SEP-18 16:49
                                2 NEW                   27-SEP-18 16:49
    U 0500110041040000          3 HOLDING    NUMBER1    27-SEP-18 16:50
                                3 NEW                   27-SEP-18 16:50
    U 0500110041040000          4 HOLDING    NUMBER1    27-SEP-18 16:51
                                4 NEW                   27-SEP-18 16:51
    U 0500110041040000          5 HOLDING    NUMBER1    27-SEP-18 16:52
                                5 NEW                   27-SEP-18 16:52
                                6 NEW                   27-SEP-18 16:53
                                7 NEW                   27-SEP-18 16:54
                                8 NEW                   27-SEP-18 16:55
                                9 NEW                   27-SEP-18 16:56
                               10 NEW                   27-SEP-18 16:57

    I have used a flashback query to see all versions of the rows. All 10 have been created and the first 5 of them have been updated by NUMBER1.

    Now, my second session continues, updating to NUMBER2. I commit and look at the row versions again:

    
    commit;
    set pagesize 1000
    select versions_operation,versions_xid,DEMO.* from DEMO versions between scn minvalue and maxvalue order by ID,2;
    
    
    V VERSIONS_XID             ID STATUS     NAME       CREATED        
    - ---------------- ---------- ---------- ---------- ---------------
    U 04001B0057030000          1 HOLDING    NUMBER2    27-SEP-18 16:48
    U 0500110041040000          1 HOLDING    NUMBER1    27-SEP-18 16:48
                                1 NEW                   27-SEP-18 16:48
    U 04001B0057030000          2 HOLDING    NUMBER2    27-SEP-18 16:49
    U 0500110041040000          2 HOLDING    NUMBER1    27-SEP-18 16:49
                                2 NEW                   27-SEP-18 16:49
    U 04001B0057030000          3 HOLDING    NUMBER2    27-SEP-18 16:50
    U 0500110041040000          3 HOLDING    NUMBER1    27-SEP-18 16:50
                                3 NEW                   27-SEP-18 16:50
    U 04001B0057030000          4 HOLDING    NUMBER2    27-SEP-18 16:51
    U 0500110041040000          4 HOLDING    NUMBER1    27-SEP-18 16:51
                                4 NEW                   27-SEP-18 16:51
    U 04001B0057030000          5 HOLDING    NUMBER2    27-SEP-18 16:52
    U 0500110041040000          5 HOLDING    NUMBER1    27-SEP-18 16:52
                                5 NEW                   27-SEP-18 16:52
                                6 NEW                   27-SEP-18 16:53
                                7 NEW                   27-SEP-18 16:54
                                8 NEW                   27-SEP-18 16:55
                                9 NEW                   27-SEP-18 16:56
                               10 NEW                   27-SEP-18 16:57
    

    This is not what I expected. I wanted my second session to process the other rows, but here it seems that it has processed the same rows as the first one. What has been done by NUMBER1 has been lost and overwritten by NUMBER2. This is inconsistent, violates ACID properties, and should not happen. An SQL statement must ensure write consistency: either by locking all the rows as soon as they are read (for non-MVCC databases where reads block writes), or by restarting the update when a mutating row is encountered. Oracle's default behaviour is the second case: the NUMBER2 query reads rows 1 to 5 because the changes by NUMBER1, not yet committed, are invisible to NUMBER2. But the execution should keep track of the columns referenced in the WHERE clause. When attempting to update a row, now that the concurrent change is visible, the update is possible only if the WHERE clause used to select the rows still selects this row. If not, the database should raise an error (this is what happens in the serializable isolation level) or restart the update in the default statement-level consistency.

    Here, probably because of the nested subquery, the write consistency is not guaranteed and this is a bug.

    One workaround would be to avoid subqueries. However, as we need to ORDER BY the rows in order to process the oldest first, we cannot avoid the subquery here. The workaround for this is to add STATUS = 'NEW' to the WHERE clause of the update itself, so that the update restart works correctly, as in the sketch below.
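    This is the same statement as before, with only the extra predicate added, so that a restarted update no longer touches rows that already left the 'NEW' status:

    UPDATE DEMO SET NAME = 'NUMBER2', STATUS = 'HOLDING'
    WHERE STATUS = 'NEW' AND ID IN (
     SELECT ID FROM (
      SELECT ID, rownum as counter
      FROM DEMO
      WHERE STATUS = 'NEW'
      ORDER BY CREATED
     )
    WHERE counter <= 5)
    ;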

    However, the goal of multithreading those processes is to be scalable, and multiple update restarts may finally serialize all those updates.

    The preferred solution for this is to ensure that the updates do not attempt to touch the same rows. This can be achieved by a SELECT … FOR UPDATE SKIP LOCKED. As this cannot be added directly to the update statement, we need a cursor. Something like this can do the job:

    
    declare counter number:=5;
    begin
     for c in (select /*+ first_rows(5) */ ID FROM DEMO 
               where STATUS = 'NEW' 
               order by CREATED
               for update skip locked)
     loop
      counter:=counter-1;
      update DEMO set NAME = 'NUMBER1', STATUS = 'HOLDING'  where ID = c.ID and STATUS = 'NEW';
      exit when counter=0;
     end loop;
    end;
    /
    commit;
    

    This can be optimized further but just gives an idea of what is needed for a scalable solution. Waiting for locks is not scalable.

    Cet article Oracle write consistency bug and multi-thread de-queuing est apparu en premier sur Blog dbi services.

    Validate your SQL Server infrastructure with dbachecks

    $
    0
    0

    Introduction

    In this blog post, I’ll do an introduction to the PowerShell module dbachecks.
    dbachecks uses Pester and dbatools to validate your SQL Server infrastructure.
    With very minimal configuration you can check that your infrastructure is configured following standard best practices or your own policy.

    We will see the following topics

    – Prerequisites for dbachecks Installation
    – Introduction to Pester
    – Perform a Check
    – Manage the Configuration items – Import & Export
    – Output
    – Power BI dashboard

    Prerequisites for dbachecks Installation

    The dbachecks module depends on the following modules:

    • dbatools
    • Pester
    • PSFramework

    The easiest way to perform the installation is to do a simple Install-Module. It will get the latest dbachecks version from the PSGallery and install all the required modules up to date.

    I had many issues with this method.
    The latest version of PSFramework (1.4.150) did not seem to work with the current dbachecks version.
    Installing the latest version of Pester (5.0.4) brings issues too.
    When running a command I would get the following error:

    Unable to find type [Pester.OutputTypes].
    At line:219 char:9
    +         [Pester.OutputTypes]$Show = 'All'
    +         ~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : InvalidOperation: (Pester.OutputTypes:TypeName) [], RuntimeException
        + FullyQualifiedErrorId : TypeNotFound

    To avoid this, prior to installing dbachecks, you should first install PSFramework version 1.1.59.
    Pester is already shipped with recent versions of Windows as version 3.4.
    If you want a newer version, manually install version 4. Issues seem to come with version 5.

    Set-PSRepository -Name "PSGallery" -InstallationPolicy Trusted
    
    Install-Module PSFramework -RequiredVersion 1.1.59
    Install-Module Pester -RequiredVersion 4.10.1 -Force -SkipPublisherCheck
    Install-Module dbachecks

    Here is what I got working:

    Pester

    dbachecks relies heavily on Pester. Pester is a framework that brings functions to build unit tests for PowerShell code.
    If you don't know what Pester is, I'd recommend you read my introduction to Pester post here.

    dbatools

    The checks performed by dbachecks are based on dbatools functions. If you haven't tried dbatools yet, I'd recommend having a look at the dbatools repository and trying a few commands.

    Perform a Check

    Now let's talk about dbachecks. It is basically a set of Pester tests for your SQL Server infrastructure, with code relying heavily on the dbatools module.
    Let's look at the list of available "Checks" from dbachecks with Get-DbcCheck.

    As you can see, there are currently 134 checks available, covering a wide range of configurations you might want to check.

    Let’s run a Check on an SQL Server instance. To do so we use the Invoke-DbcCheck command with the Check UniqueTag and the target Instance name.
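    For example, running a single check against one instance could look like this (instance name is a placeholder):

    Invoke-DbcCheck -SqlInstance localhost -Check ValidDatabaseOwner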

    This one checks the database owner for all user databases of the instance. The default value for this check is configured to "sa".
    My check returned everything green: there's only one database on this instance and its database owner is "sa".

    Check multiple instances

    There are many ways to run checks against multiple instances.
    You can define a list of instances in the configuration with the command below. I'll come to configuration elements in a minute.

    Set-DbcConfig -Name app.sqlinstance -Value "server1\InstA", "localhost", "server2\instA"

    Here I will use a CMS and the dbatools command Get-DbaRegisteredServer to get my list of instances. On the other instance, one of the databases has a non-"sa" database owner.
    Maybe this owner is a valid one and I want this check to succeed. We can modify the check configuration.

    Check Configuration elements

    All checks can have configuration elements.
    To search the configuration elements you can use Get-DbcConfig. As I want to change the database owner's name, I can search for all config items with names like "owner".

    The configuration values are also available with Get-DbcConfigValue.

    So now, with Set-DbcConfig I can add a valid database owner to the ValidDatabaseOwner check.
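    As a sketch, assuming the configuration item is policy.validdbowner.name (you can confirm the exact name by filtering Get-DbcConfig):

    Get-DbcConfig | Where-Object Name -like '*owner*'
    Set-DbcConfig -Name policy.validdbowner.name -Value 'sa', 'WIN10VM4\win10vm4admin'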

    Here is the output of the same check run again:

    Of course, multiple checks can be run at the same time by passing several check names to the Check parameter.

    Manage the Configuration items – Import & Export

    We have seen how to use Set-DbcConfig to modify your checks configuration. You don’t need to change those configurations one by one every time you want to check your infrastructure.
    All configuration items can be exported to a JSON file and imported back again.

    I can set the configuration items as needed and then do Export-DbcConfig specifying the destination file:

    # LastFullBackup - Maximum number of days before Full Backups are considered outdated
    Set-DbcConfig -Name policy.backup.fullmaxdays -Value 7
    
    # Percent disk free
    Set-DbcConfig -Name policy.diskspace.percentfree -Value 5
    
    # The maximum percentage variance that the last run of a job is allowed over the average for that job
    Set-DbcConfig -Name agent.lastjobruntime.percentage -Value 20
    # The maximum percentage variance that a currently running job is allowed over the average for that job
    Set-DbcConfig -Name agent.longrunningjob.percentage -Value 20
    
    # Maximum job history log size (in rows). The value -1 means disabled
    Set-DbcConfig -Name agent.history.maximumhistoryrows -Value 10000
    
    # The maximum number of days to check for failed jobs
    Set-DbcConfig -Name agent.failedjob.since -Value 8
    
    # The number of days prior to check for error log issues - default 2
    # (config name assumed here: policy.errorlog.warningwindow)
    Set-DbcConfig -Name policy.errorlog.warningwindow -Value 3
    
    Export-DbcConfig -Path "$($HOME)\Documents\WindowsPowerShell\MorningCheck-Qual.json"

    Here is the output of the Export-DbcConfig:

    As you can guess, imports of config files are done with Import-DbcConfig.

    Import-DbcConfig -Path "$($HOME)\Documents\WindowsPowerShell\MorningCheck-Qual.json"

    Output

    The Show parameter

    The dbachecks output in the console gives a great level of detail on what is going on. When you have thousands of checks running you might not want this wall of green text.
    To show only the failed checks you can use the -Show parameter of Invoke-DbcCheck with the value "Fails".

    Invoke-DbcCheck -Check ValidDatabaseOwner -Show Fails

    If you want even fewer details, you can use -Show Summary.

    XML files

    Test results can also be saved to XML files using the OutputFile parameter, like this:
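    A sketch of such an invocation (the path is a placeholder, and I assume the Pester pass-through parameter OutputFormat with the NUnitXml format):

    Invoke-DbcCheck -SqlInstance localhost -Check ValidDatabaseOwner -OutputFormat NUnitXml -OutputFile "C:\temp\dbachecks-results.xml"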

    Here is an output example:

    <?xml version="1.0" encoding="utf-8" standalone="no"?>
    <test-results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="nunit_schema_2.5.xsd" name="Pester" total="2" errors="0" failures="1" not-run="0" inconclusive="0" ignored="0" skipped="0" invalid="0" date="2020-12-14" time="15:29:47">
      <environment clr-version="4.0.30319.42000" user-domain="win10vm4" cwd="C:\Users\win10vm4admin\Documents\WindowsPowerShell\Modules\dbachecks\2.0.7\checks" platform="Microsoft Windows 10 Pro|C:\WINDOWS|\Device\Harddisk0\Partition4" machine-name="win10vm4" nunit-version="2.5.8.0" os-version="10.0.18363" user="win10vm4admin" />
      <culture-info current-culture="en-US" current-uiculture="en-US" />
      <test-suite type="TestFixture" name="Pester" executed="True" result="Failure" success="False" time="0.3166" asserts="0" description="Pester">
        <results>
          <test-suite type="TestFixture" name="C:\Users\win10vm4admin\Documents\WindowsPowerShell\Modules\dbachecks\2.0.7\checks\Database.Tests.ps1" executed="True" result="Failure" success="False" time="0.3166" asserts="0" description="C:\Users\win10vm4admin\Documents\WindowsPowerShell\Modules\dbachecks\2.0.7\checks\Database.Tests.ps1">
            <results>
              <test-suite type="TestFixture" name="Valid Database Owner" executed="True" result="Failure" success="False" time="0.2048" asserts="0" description="Valid Database Owner">
                <results>
                  <test-suite type="TestFixture" name="Testing Database Owners on localhost" executed="True" result="Failure" success="False" time="0.1651" asserts="0" description="Testing Database Owners on localhost">
                    <results>
                      <test-case description="Database dbi_tools - owner sa should be in this list ( sa ) on win10vm4" name="Valid Database Owner.Testing Database Owners on localhost.Database dbi_tools - owner sa should be in this list ( sa ) on win10vm4" time="0.0022" asserts="0" success="True" result="Success" executed="True" />
                      <test-case description="Database testDB - owner win10vm4\win10vm4admin should be in this list ( sa ) on win10vm4" name="Valid Database Owner.Testing Database Owners on localhost.Database testDB - owner win10vm4\win10vm4admin should be in this list ( sa ) on win10vm4" time="0.0043" asserts="0" success="False" result="Failure" executed="True">
                        <failure>
                          <message>Expected collection sa to contain 'win10vm4\win10vm4admin', because The account that is the database owner is not what was expected, but it was not found.</message>
                          <stack-trace>at &lt;ScriptBlock&gt;, C:\Users\win10vm4admin\Documents\WindowsPowerShell\Modules\dbachecks\2.0.7\checks\Database.Tests.ps1: line 172
    172:                         $psitem.Owner | Should -BeIn $TargetOwner -Because "The account that is the database owner is not what was expected"</stack-trace>
                        </failure>
                      </test-case>
                    </results>
                  </test-suite>
                </results>
              </test-suite>
            </results>
          </test-suite>
        </results>
      </test-suite>
    </test-results>

    These XML files can be used to automate reporting with the tool of your choice.

    Excel export

    There's a way to export the results to Excel. If you want to try it, I'd recommend reading Jess Pomfret's blog post dbachecks meets ImportExcel.

    Power BI dashboard

    Checks can be displayed in a beautiful PowerBI dashboard.

    The Update-DbcPowerBiDataSource command converts results and exports files in the required format for launching the Power BI command Start-DbcPowerBI.

    The Update-DbcPowerBiDataSource command can take an "Environment" parameter which is useful to compare your environments.
    Here is an example of how it can be used.

    Import-DbcConfig -Path "$($HOME)\Documents\WindowsPowerShell\MorningCheck-Qual.json"
    Invoke-DbcCheck -Check ValidDatabaseOwner, ErrorLogCount `
        -Show Summary -Passthru | Update-DbcPowerBiDataSource -Environment 'Qual'
    
    Import-DbcConfig -Path "$($HOME)\Documents\WindowsPowerShell\MorningCheck-Prod-Listener.json"
    Invoke-DbcCheck -Check ValidDatabaseOwner, ErrorLogCount `
        -Show Summary -Passthru | Update-DbcPowerBiDataSource -Environment 'Prod'
    
    Start-DbcPowerBi

    The dashboard.

    Conclusion

    From my experience, dbatools usage among DBAs has grown a lot recently. Likewise, I think dbachecks will be used more and more by DBAs in the years to come.
    It's easy to use and can save you a lot of time for your daily/weekly SQL Server checks.

    This blog post was just to get you started with dbachecks. Do not hesitate to comment if you have any questions.

    Cet article Validate your SQL Server infrastructure with dbachecks est apparu en premier sur Blog dbi services.

    Amazon Aurora: calling a lambda from a trigger

    $
    0
    0

    By Franck Pachot

    .
    You may want your RDS database to interact with other AWS services. For example, send a notification on a business or administration situation, with a "push" method rather than a "pull" one from a CloudWatch alert. You may even design this call to be triggered on database changes. And Amazon Aurora provides this possibility by running a lambda from the database, by calling mysql.lambda_async() from a MySQL trigger. This is an interesting feature, but I think that it is critical to understand how it works in order to use it correctly.
    This is the kind of feature that looks very nice on a whiteboard or powerpoint: the DML event (like an update) runs a trigger that calls the lambda, all event-driven. However, this is also dangerous: are you sure that every update must execute this process? What about an update during an application release, or a dump import, or a logical replication target? Now imagine that you have a bug in your application that has set some wrong data and you have to fix it in emergency in the production database, under stress, with manual updates and aren’t aware of that trigger, or just forget about it in this situation… Do you want to take this risk? As the main idea is to run some external service, the consequence might be very visible and hard to fix, like spamming all your users, or involuntarily DDoS a third-tier application.

    I highly encourage encapsulating the DML and the call of the lambda in a procedure that is clearly named and described. For example, let's take a silly example: sending a "your address has been changed" message when a user updates his address. Don't put the "send message" call in an AFTER UPDATE trigger, because the UPDATE semantic is to update, not to send a message. What you can do is write a stored procedure like UPDATE_ADDRESS() that will do the UPDATE and call the "send message" lambda. You may even provide a boolean parameter to enable or not the sending of the message. Then, the ones who call the stored procedure know what will happen. And the ones who just do an update… will just do an update. Actually, executing DML directly from the application is often a mistake. A database should expose business-related data services, like many other components of your application architecture, and this is exactly the goal of stored procedures.
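    Here is a minimal sketch of such an encapsulation, with hypothetical table, column and lambda names, only to illustrate the idea: the caller decides explicitly whether the notification is sent.

    delimiter $$
    create procedure update_address(
      in p_user_id int,
      in p_address varchar(200),
      in p_notify  boolean)
    begin
      -- the actual DML
      update users set address = p_address where id = p_user_id;
      -- the external call, made explicit and optional
      if p_notify then
        call mysql.lambda_async(
          'arn:aws:lambda:eu-central-1:123456789012:function:address-changed',
          concat('{"user_id":"',p_user_id,'"}'));
      end if;
    end;
    $$
    delimiter ;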

    I’m sharing here some tests on calling lambda from Aurora MySQL.

    Wiring the database to lambdas

    A lambda is not a simple procedure that you embed in your program. It is a service and you have to control the access to it:

    • You create the lambda (create function, deploy and get the ARN)
    • You define an IAM policy to invoke this lambda
    • You define an IAM role to apply this policy
    • You set this role as aws_default_lambda_role in the RDS cluster parameter group
    • You add this role to the cluster (RDS -> database cluster -> Manage IAM roles)
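    As a sketch only (identifiers, names and ARNs are placeholders), the role attachment and the parameter setting can be done with the AWS CLI, once an IAM policy allowing the lambda:InvokeFunction action on the function ARN is in place:

    # attach the IAM role to the Aurora cluster
    aws rds add-role-to-db-cluster \
      --db-cluster-identifier database-1 \
      --role-arn arn:aws:iam::123456789012:role/aurora-invoke-lambda

    # reference the role in the cluster parameter group
    aws rds modify-db-cluster-parameter-group \
      --db-cluster-parameter-group-name my-aurora-cluster-params \
      --parameters "ParameterName=aws_default_lambda_role,ParameterValue=arn:aws:iam::123456789012:role/aurora-invoke-lambda,ApplyMethod=immediate"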

    Here is my lambda which just logs the event for my test:

    
    import json
    
    def lambda_handler(event, context):
        print('Hello.')
        print(event)
        return {
            'statusCode': 200,
            'body': json.dumps('Hello from Lambda!')
        }
    
    

    Creating the test database

    
     drop database if exists demo;
     create database demo;
     use demo;
     drop table if exists t;
     create table t ( x int, y int );
     insert into t values ( 1, 1 );
    

    I have a simple table here, with a simple row.

    
    delimiter $$
    create trigger t_bufer before update on t for each row
    begin
     set NEW.y=OLD.x;
     call mysql.lambda_async(
        'arn:aws:lambda:eu-central-1:802756008554:function:log-me',
        concat('{"trigger":"t_bufer","connection":"',connection_id(),'","old": "',OLD.x,'","new":"',NEW.x,'"}'));
    end;
    $$
    delimiter ;
    

    This is my trigger which calls my lambda on an update with old and new value in the message.

    
    MYSQL_PS1="Session 1 \R:\m:\s> " mysql -v -A --host=database-1.cluster-ce5fwv4akhjp.eu-central-1.rds.amazonaws.com --port=3306 --user=admin --password=ServerlessV2
    

    I connect a first session, displaying the time and session in the prompt.

    
    Session 1 23:11:55> use demo;
    Database changed
    
    Session 1 23:12:15> truncate table t;
    --------------
    truncate table t
    --------------
    
    Query OK, 0 rows affected (0.09 sec)
    
    Session 1 23:12:29> insert into t values ( 1, 1 );
    --------------
    insert into t values ( 1, 1 )
    --------------
    
    Query OK, 1 row affected (0.08 sec)
    

    This just resets the test case when I want to re-run it.

    
    Session 1 23:12:36> start transaction;
    --------------
    start transaction
    --------------
    
    Query OK, 0 rows affected (0.07 sec)
    
    Session 1 23:12:48> update t set x = 42;
    --------------
    update t set x = 42
    --------------
    
    Query OK, 1 row affected (0.11 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    
    Session 1 23:12:55> rollback;
    --------------
    rollback
    --------------
    
    Query OK, 0 rows affected (0.02 sec)
    

    I updated one row, and rolled back my transaction. This is to show that you must be aware that calling a lambda is outside the ACID protection of relational databases. The trigger is executed during the update, without knowing whether the transaction will be committed or not (voluntarily or because an exception is encountered). When you do only things in the database (like writing into another table) there is no problem, because this happens within the transaction. If the transaction is rolled back, all the DML done by the triggers is rolled back as well. Even if it occurred, nobody sees its effect, except the current session, before the whole transaction is committed.

    But when you call a lambda, synchronously or asynchronously, the call is executed and its effect will not be rolled back if your transaction does not complete. This can be ok in some cases, if what you execute is related to the intention of the update and not its completion. Or you must manage this exception in your lambda, maybe by checking in the database that the transaction occurred. But in that case, you should really question your architecture (a call to a service, calling back to the caller…)

    So… be careful with that. If your lambda is there to be executed when a database operation has been done, it may have to be done after the commit, in the procedural code that has executed the transaction.

    Another test…

    This non-ACID execution was the important point I wanted to emphasize, so you can stop here if you want. This other test is probably interesting only for people used to Oracle Database. In general, nothing guarantees that a trigger is executed only once for the triggering operation. What we have seen above (rollback) can be done internally when a serialization exception is encountered and the database retries the operation. Oracle Database has non-blocking reads, and this is not only for SELECT but also for the read phase of an UPDATE. You may have to read a lot of rows to verify the predicate and update only a few of them, and you don't want to lock all the rows read but only the ones that are updated. Manually, you would do that with a serializable transaction and retry in case you encounter a row that has been modified between your MVCC snapshot and the current update time. But at statement level, Oracle does that for you.

    It seems that it does not happen in Aurora MySQL and PostgreSQL, as the locking for reads is more aggressive, but just in case I tested the same scenario where an update restart would have occurred in Oracle.

    
    Session 1 23:13:00> start transaction;
    --------------
    start transaction
    --------------
    
    Query OK, 0 rows affected (0.06 sec)
    
    Session 1 23:13:09> update t set x = x+1;
    --------------
    update t set x = x+1
    --------------
    
    Query OK, 1 row affected (0.02 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    
    Session 1 23:13:16> select * from t;
    --------------
    select * from t
    --------------
    
    +------+------+
    | x    | y    |
    +------+------+
    |    2 |    1 |
    +------+------+
    1 row in set (0.01 sec)
    
    Session 1 23:13:24>
    
    

    I have started a transaction that increased the value of X, but the transaction is still open. What I do next is from another session.

    This is session 2:

    
    Session 2 23:13:32> use demo;
    
    Database changed
    Session 2 23:13:34>
    Session 2 23:13:35> select * from t;
    --------------
    select * from t
    --------------
    
    +------+------+
    | x    | y    |
    +------+------+
    |    1 |    1 |
    +------+------+
    1 row in set (0.01 sec)
    

    Of course, thanks to transaction isolation, I do not see the uncommitted change.

    
    Session 2 23:13:38> update t set x = x+1 where x > 0;
    --------------
    update t set x = x+1 where x > 0
    --------------
    

    At this step, the update hangs on the locked row.

    Now back in the first session:

    
    Session 1 23:13:49>
    Session 1 23:13:50>
    Session 1 23:13:50>
    Session 1 23:13:50> commit;
    --------------
    commit
    --------------
    
    Query OK, 0 rows affected (0.02 sec)
    

    I just committed my change here, so X has been increased to the value 2.

    And here is what happened in my second session, with the lock released by the first session:

    
    Query OK, 1 row affected (11.42 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    
    Session 2 23:13:58> commit;
    --------------
    commit
    --------------
    
    Query OK, 0 rows affected (0.01 sec)
    
    Session 2 23:14:10> select * from t;
    --------------
    select * from t
    --------------
    
    +------+------+
    | x    | y    |
    +------+------+
    |    3 |    2 |
    +------+------+
    1 row in set (0.01 sec)
    
    Session 2 23:14:18>
    
    

    This is the correct behavior. Even if a select sees the value of X=1 the update cannot be done until the first session has committed its transaction. This is why it waited, and it has read the committed value of X=2 which is then incremented to 3.

    And finally here is what was logged by my lambda, as a screenshot and as text:

    
    2020-12-13T23:12:55.558+01:00	START RequestId: 39e4e41f-7853-4b11-a12d-4a3147be3fc7 Version: $LATEST	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:12:55.561+01:00	Hello.	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:12:55.561+01:00	{'trigger': 't_bufer', 'connection': '124', 'old': '1', 'new': '42'}	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:12:55.562+01:00	END RequestId: 39e4e41f-7853-4b11-a12d-4a3147be3fc7	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:12:55.562+01:00	REPORT RequestId: 39e4e41f-7853-4b11-a12d-4a3147be3fc7 Duration: 1.16 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 51 MB	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:16.620+01:00	START RequestId: 440128db-d6de-4b2c-aa98-d7bedf12a3d4 Version: $LATEST	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:16.624+01:00	Hello.	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:16.624+01:00	{'trigger': 't_bufer', 'connection': '124', 'old': '1', 'new': '2'}	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:16.624+01:00	END RequestId: 440128db-d6de-4b2c-aa98-d7bedf12a3d4	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:16.624+01:00	REPORT RequestId: 440128db-d6de-4b2c-aa98-d7bedf12a3d4 Duration: 1.24 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 51 MB	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:58.156+01:00	START RequestId: c50ceab7-6e75-4e43-b77d-26c1f6347fec Version: $LATEST	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:58.160+01:00	Hello.	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:58.160+01:00	{'trigger': 't_bufer', 'connection': '123', 'old': '2', 'new': '3'}	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:58.160+01:00	END RequestId: c50ceab7-6e75-4e43-b77d-26c1f6347fec	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    2020-12-13T23:13:58.160+01:00	REPORT RequestId: c50ceab7-6e75-4e43-b77d-26c1f6347fec Duration: 0.91 ms Billed Duration: 1 ms Memory Size: 128 MB Max Memory Used: 51 MB	2020/12/13/[$LATEST]25e73c8c6f9e4d168fa29b9ad2ba76d8
    

    First, we see at 23:12:55 the update from X=1 to X=42 that I rolled back later. This proves that the call to the lambda is not transactional. It may sound obvious, but if you come from Oracle Database you would have used Advanced Queuing, where the queue is stored in an RDBMS table and therefore shares the same transaction as the submitter.
    My update occurred at 23:12:48, but remember that those calls are asynchronous, so the log happens a bit later.

    Then there was my second test where I updated, at 23:13:09, X from 1 to 2, and we see this update logged at 23:13:16, which is after the update (for the asynchronous reason) but before the commit, which happened at 23:13:50 according to my session log above. There is no doubt, then, that the execution of the lambda does not wait for the completion of the transaction that triggered it.

    And then the update from session 2, which was executed at 23:13:38 but returned at 23:13:50 as it was waiting for the first session to end its transaction. The lambda log at 23:13:58 shows it, and shows that the old value is X=2, which is expected as the update was done after the first session's change. This is where, in Oracle, we would have seen two entries: one updating from X=1, because this would have been read without a lock, and then rolled back to restart the update from X=2. But we don't have this problem here, as MySQL acquires a row lock during the read phase.

    However, nothing guarantees that there is no internal rollback + restart. And anyway, a rollback can happen for many reasons, and you should decide, during design, whether the call to the lambda should occur for the DML intention or for the DML completion. For example, if you use it for some event sourcing, you may accept the asynchronous delay, but you don't want to receive an event for something that actually didn't happen.

    The article Amazon Aurora: calling a lambda from a trigger appeared first on the dbi services Blog.

    Efficiently query DBA_EXTENTS for FILE_ID / BLOCK_ID


    By Franck Pachot

    .
    This was initially posted to CERN Database blog on Thursday, 27 September 2018 where it seems to be lost. Here is a copy thanks to web.archive.org

    Did you ever try to query DBA_EXTENTS on a very large database with LMT tablespaces? I had to in the past, in order to find which segment a corrupt block belonged to. The information about extent allocation is stored in the datafile headers, visible through X$KTFBUE, and queries on it can be very expensive. In addition to that, the optimizer tends to start with the segments and probe X$KTFBUE for each of them. At that time, I had quickly created a view on the internal dictionary tables, forcing it to start with X$KTFBUE through a materialized CTE, to replace DBA_EXTENTS. I published this on dba-village in 2006.

    I recently wanted to know the segment/extent for a hot block, identified by its file_id and block_id, on a 900TB database with 7000 datafiles and 90000 extents, so I went back to this old query and I got my result in 1 second. The idea is to be sure that we start with the file (X$KCCFE) and then get to the extent allocation (X$KTFBUE) before going to the segments:

    So here is the query:

    
    column owner format a6
    column segment_type format a20
    column segment_name format a15
    column partition_name format a15
    set linesize 200
    set timing on time on echo on autotrace on stat
    WITH
     l AS ( /* LMT extents indexed on ktfbuesegtsn,ktfbuesegfno,ktfbuesegbno */
      SELECT ktfbuesegtsn segtsn,ktfbuesegfno segrfn,ktfbuesegbno segbid, ktfbuefno extrfn,
             ktfbuebno fstbid,ktfbuebno + ktfbueblks - 1 lstbid,ktfbueblks extblks,ktfbueextno extno
      FROM sys.x$ktfbue
     ),
     d AS ( /* DMT extents ts#, segfile#, segblock# */
      SELECT ts# segtsn,segfile# segrfn,segblock# segbid, file# extrfn,
             block# fstbid,block# + length - 1 lstbid,length extblks, ext# extno
      FROM sys.uet$
     ),
     s AS ( /* segment information for the tablespace that contains afn file */
      SELECT /*+ materialized */
      f1.fenum afn,f1.ferfn rfn,s.ts# segtsn,s.FILE# segrfn,s.BLOCK# segbid ,s.TYPE# segtype,f2.fenum segafn,t.name tsname,blocksize
      FROM sys.seg$ s, sys.ts$ t, sys.x$kccfe f1,sys.x$kccfe f2 
      WHERE s.ts#=t.ts# AND t.ts#=f1.fetsn AND s.FILE#=f2.ferfn AND s.ts#=f2.fetsn
     ),
     m AS ( /* extent mapping for the tablespace that contains afn file */
    SELECT /*+ use_nl(e) ordered */
     s.afn,s.segtsn,s.segrfn,s.segbid,extrfn,fstbid,lstbid,extblks,extno, segtype,s.rfn, tsname,blocksize
     FROM s,l e
     WHERE e.segtsn=s.segtsn AND e.segrfn=s.segrfn AND e.segbid=s.segbid
     UNION ALL
     SELECT /*+ use_nl(e) ordered */ 
     s.afn,s.segtsn,s.segrfn,s.segbid,extrfn,fstbid,lstbid,extblks,extno, segtype,s.rfn, tsname,blocksize
     FROM s,d e
      WHERE e.segtsn=s.segtsn AND e.segrfn=s.segrfn AND e.segbid=s.segbid
     UNION ALL
     SELECT /*+ use_nl(e) use_nl(t) ordered */
     f.fenum afn,null segtsn,null segrfn,null segbid,f.ferfn extrfn,e.ktfbfebno fstbid,e.ktfbfebno+e.ktfbfeblks-1 lstbid,e.ktfbfeblks extblks,null extno, null segtype,f.ferfn rfn,name tsname,blocksize
     FROM sys.x$kccfe f,sys.x$ktfbfe e,sys.ts$ t
     WHERE t.ts#=f.fetsn and e.ktfbfetsn=f.fetsn and e.ktfbfefno=f.ferfn
     UNION ALL
     SELECT /*+ use_nl(e) use_nl(t) ordered */
     f.fenum afn,null segtsn,null segrfn,null segbid,f.ferfn extrfn,e.block# fstbid,e.block#+e.length-1 lstbid,e.length extblks,null extno, null segtype,f.ferfn rfn,name tsname,blocksize
     FROM sys.x$kccfe f,sys.fet$ e,sys.ts$ t
     WHERE t.ts#=f.fetsn and e.ts#=f.fetsn and e.file#=f.ferfn
     ),
     o AS (
      SELECT s.tablespace_id segtsn,s.relative_fno segrfn,s.header_block   segbid,s.segment_type,s.owner,s.segment_name,s.partition_name
      FROM SYS_DBA_SEGS s
     ),
    datafile_map as (
    SELECT
     afn file_id,fstbid block_id,extblks blocks,nvl(segment_type,decode(segtype,null,'free space','type='||segtype)) segment_type,
     owner,segment_name,partition_name,extno extent_id,extblks*blocksize bytes,
     tsname tablespace_name,rfn relative_fno,m.segtsn,m.segrfn,m.segbid
     FROM m,o WHERE extrfn=rfn and m.segtsn=o.segtsn(+) AND m.segrfn=o.segrfn(+) AND m.segbid=o.segbid(+)
    UNION ALL
    SELECT
     file_id+(select to_number(value) from v$parameter WHERE name='db_files') file_id,
     1 block_id,blocks,'tempfile' segment_type,
     '' owner,file_name segment_name,'' partition_name,0 extent_id,bytes,
      tablespace_name,relative_fno,0 segtsn,0 segrfn,0 segbid
     FROM dba_temp_files
    )
    select * from datafile_map where file_id=5495 and 11970455 between block_id and block_id+blocks
    

    And here is the result, with execution statistics:

    
    
       FILE_ID   BLOCK_ID     BLOCKS SEGMENT_TYPE         OWNER  SEGMENT_NAME    PARTITION_NAME    EXTENT_ID      BYTES TABLESPACE_NAME      RELATIVE_FNO     SEGTSN     SEGRFN    SEGBID
    ---------- ---------- ---------- -------------------- ------ --------------- ---------------- ---------- ---------- -------------------- ------------ ---------- ---------- ----------
          5495   11964544       8192 INDEX PARTITION      LHCLOG DN_PK           PART_DN_20161022       1342   67108864 LOG_DATA_20161022            1024       6364       1024        162
    
    Elapsed: 00:00:01.25
    
    Statistics
    ----------------------------------------------------------
            103  recursive calls
           1071  db block gets
          21685  consistent gets
            782  physical reads
            840  redo size
           1548  bytes sent via SQL*Net to client
            520  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              1  rows processed
    

    Knowing the segment from the block address is important in performance tuning, when we get the file_id/block_id from wait event parameters. It is even more important when a block corruption is detected, and having a fast query may help.
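
    For example, an illustrative query (not part of the original post) to get the file#/block# from the wait event parameters of active sessions, ready to be fed into the datafile_map query above:

    select sid, event, p1 file_id, p2 block_id
    from v$session
    where state='WAITING' and event like 'db file%read';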

    The article Efficiently query DBA_EXTENTS for FILE_ID / BLOCK_ID appeared first on the dbi services Blog.

    NTP is not working for ODA new deployment (reimage) in version 19.8?


    Having recently reimaged and patched several ODAs to versions 19.8 and 19.9, I noticed an issue with NTP. During my troubleshooting I could determine the root cause and find the appropriate solution. Through this blog I would like to share my experience with you.

    Symptom/Analysis

    ODA version 19.6 and higher comes with Oracle Linux 7. Since Oracle Linux 7 the default time synchronization service is not ntp any more but chrony. In Oracle Linux 7, ntp is still available and can still be used, but the ntp service will disappear in Oracle Linux 8.

    What I observed from my recent deployments and patchings is:

    • Patching your ODA from 19.6 to version 19.8 or 19.9: the system will still use ntpd and the chronyd service will be deactivated. Everything works fine.
    • Reimaging your ODA to version 19.8: chronyd will be activated and NTP will not work any more.
    • Reimaging your ODA to version 19.9: ntpd will be activated and NTP will work with no problem.

    So the problem only exists if you reimage your ODA to version 19.8.

    Problem explanation

    The problem is due to the fact that the odacli script deploying the appliance still updates the ntp configuration (/etc/ntp.conf) with the NTP addresses provided, and not the chrony configuration. But chronyd is the service that gets activated and started by default, and it then runs with no configuration.
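
    A quick way to check the situation after reimaging is to see which daemon is active and which configuration file actually received the NTP servers (commands only, the output will differ on your system):

    [root@ODA01 ~]# systemctl is-active ntpd chronyd
    [root@ODA01 ~]# grep '^server' /etc/ntp.conf /etc/chrony.conf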

    Solving the problem

    There are two solutions.

    A/ Configure and use chronyd

    You configure /etc/chrony.conf with the NTP addresses given during appliance creation and you restart the chronyd service.

    Configure chrony:

    oracle@ODA01:/u01/app/oracle/local/dmk/etc/ [rdbms19.8.0.0] vi /etc/chrony.conf
    
    oracle@ODA01:/u01/app/oracle/local/dmk/etc/ [rdbms19.8.0.0] cat /etc/chrony.conf
    # Use public servers from the pool.ntp.org project.
    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    #server 0.pool.ntp.org iburst
    #server 1.pool.ntp.org iburst
    #server 2.pool.ntp.org iburst
    #server 3.pool.ntp.org iburst
    server 212.X.X.X.103 prefer
    server 212.X.X.X.100
    server 212.X.X.X.101
    
    
    # Record the rate at which the system clock gains/losses time.
    driftfile /var/lib/chrony/drift
    
    # Allow the system clock to be stepped in the first three updates
    # if its offset is larger than 1 second.
    makestep 1.0 3
    
    # Enable kernel synchronization of the real-time clock (RTC).
    rtcsync
    
    # Enable hardware timestamping on all interfaces that support it.
    #hwtimestamp *
    
    # Increase the minimum number of selectable sources required to adjust
    # the system clock.
    #minsources 2
    
    # Allow NTP client access from local network.
    #allow 192.168.0.0/16
    
    # Serve time even if not synchronized to a time source.
    #local stratum 10
    
    # Specify file containing keys for NTP authentication.
    #keyfile /etc/chrony.keys
    
    # Specify directory for log files.
    logdir /var/log/chrony
    
    # Select which information is logged.
    #log measurements statistics tracking
    

    And you restart the chrony service:

    [root@ODA01 ~]# service chronyd restart
    Redirecting to /bin/systemctl restart chronyd.service
    

    B/ Start ntp

    Starting ntp will automatically stop the chrony service.

    [root@ODA01 ~]# ntpq -p
    ntpq: read: Connection refused
    
    [root@ODA01 ~]# service ntpd restart
    Redirecting to /bin/systemctl restart ntpd.service
    

    Checking synchronization:

    [root@ODA01 ~]# ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    lantime. domain_name .STEP.          16 u    - 1024    0    0.000    0.000   0.000
    *ntp1. domain_name    131.188.3.223    2 u  929 1024  377    0.935   -0.053   0.914
    +ntp2. domain_name    131.188.3.223    2 u  113 1024  377    0.766    0.184   2.779
    
    

    Checking both ntp and chrony services:

    [root@ODA01 ~]# service ntpd status
    Redirecting to /bin/systemctl status ntpd.service
    ● ntpd.service - Network Time Service
       Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
       Active: active (running) since Fri 2020-11-27 09:40:08 CET; 31min ago
      Process: 68548 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
    Main PID: 68549 (ntpd)
        Tasks: 1
       CGroup: /system.slice/ntpd.service
               └─68549 /usr/sbin/ntpd -u ntp:ntp -g
    
    Nov 27 09:40:08 ODA01 ntpd[68549]: ntp_io: estimated max descriptors: 1024, initial socket boundary: 16
    Nov 27 09:40:08 ODA01 ntpd[68549]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
    Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 1 lo 127.0.0.1 UDP 123
    Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 2 btbond1 10.X.X.10 UDP 123
    Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 3 priv0 192.X.X.24 UDP 123
    Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 4 virbr0 192.X.X.1 UDP 123
    Nov 27 09:40:08 ODA01 ntpd[68549]: Listening on routing socket on fd #21 for interface updates
    Nov 27 09:40:08 ODA01 ntpd[68549]: 0.0.0.0 c016 06 restart
    Nov 27 09:40:08 ODA01 ntpd[68549]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
    Nov 27 09:40:08 ODA01 ntpd[68549]: 0.0.0.0 c011 01 freq_not_set
    
    [root@ODA01 ~]# service chronyd status
    Redirecting to /bin/systemctl status chronyd.service
    ● chronyd.service - NTP client/server
       Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
       Active: inactive (dead) since Fri 2020-11-27 09:40:08 CET; 32min ago
         Docs: man:chronyd(8)
               man:chrony.conf(5)
      Process: 46183 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
      Process: 46180 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
    Main PID: 46182 (code=exited, status=0/SUCCESS)
    
    Nov 27 09:18:25 ODA01 systemd[1]: Starting NTP client/server...
    Nov 27 09:18:25 ODA01 chronyd[46182]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
    Nov 27 09:18:25 ODA01 chronyd[46182]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
    Nov 27 09:18:25 ODA01 systemd[1]: Started NTP client/server.
    Nov 27 09:40:08 ODA01 systemd[1]: Stopping NTP client/server...
    Nov 27 09:40:08 ODA01 systemd[1]: Stopped NTP client/server.
    

    You might need to disable the chronyd service with systemctl to avoid chronyd starting automatically after a server reboot.
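
    For example, to keep ntpd as the service enabled at boot and prevent chronyd from starting again (adapt to your own policy):

    [root@ODA01 ~]# systemctl disable chronyd
    [root@ODA01 ~]# systemctl enable ntpd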

    Are you getting a socket error with chrony?

    If you are getting the following error when starting chrony, you will need to pass the appropriate option so that chronyd starts with IPv4 only:

    Nov 27 09:09:19 ODA01 chronyd[35107]: Could not open IPv6 command socket : Address family not supported by protocol.
    

    Example of error encountered:

    [root@ODA01 ~]# service chronyd status
    Redirecting to /bin/systemctl status chronyd.service
    ● chronyd.service - NTP client/server
       Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
       Active: active (running) since Fri 2020-11-27 09:09:19 CET; 5min ago
         Docs: man:chronyd(8)
               man:chrony.conf(5)
      Process: 35109 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
      Process: 35105 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
    Main PID: 35107 (chronyd)
        Tasks: 1
       CGroup: /system.slice/chronyd.service
               └─35107 /usr/sbin/chronyd
    
    Nov 27 09:09:19 ODA01 systemd[1]: Starting NTP client/server...
    Nov 27 09:09:19 ODA01 chronyd[35107]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
    Nov 27 09:09:19 ODA01 chronyd[35107]: Could not open IPv6 command socket : Address family not supported by protocol
    Nov 27 09:09:19 ODA01 chronyd[35107]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
    Nov 27 09:09:19 ODA01 systemd[1]: Started NTP client/server.
    

    The chronyd system service uses a variable to set its options:

    [root@ODA01 ~]# cat /usr/lib/systemd/system/chronyd.service
    [Unit]
    Description=NTP client/server
    Documentation=man:chronyd(8) man:chrony.conf(5)
    After=ntpdate.service sntp.service ntpd.service
    Conflicts=ntpd.service systemd-timesyncd.service
    ConditionCapability=CAP_SYS_TIME
    
    [Service]
    Type=forking
    PIDFile=/var/run/chrony/chronyd.pid
    EnvironmentFile=-/etc/sysconfig/chronyd
    ExecStart=/usr/sbin/chronyd $OPTIONS
    ExecStartPost=/usr/libexec/chrony-helper update-daemon
    PrivateTmp=yes
    ProtectHome=yes
    ProtectSystem=full
    
    [Install]
    WantedBy=multi-user.target
    

    You need to add the -4 option to the chronyd service configuration file:

    [root@ODA01 ~]# cat /etc/sysconfig/chronyd
    # Command-line options for chronyd
    OPTIONS=""
    
    [root@ODA01 ~]# vi /etc/sysconfig/chronyd
    
    [root@ODA01 ~]# cat /etc/sysconfig/chronyd
    # Command-line options for chronyd
    OPTIONS="-4"
    

    You will just need to restart the chrony service:

    [root@ODA01 ~]# service chronyd restart
    Redirecting to /bin/systemctl restart chronyd.service
    
    [root@ODA01 ~]# service chronyd status
    Redirecting to /bin/systemctl status chronyd.service
    ● chronyd.service - NTP client/server
       Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
       Active: active (running) since Fri 2020-11-27 09:18:25 CET; 4s ago
         Docs: man:chronyd(8)
               man:chrony.conf(5)
      Process: 46183 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
      Process: 46180 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
    Main PID: 46182 (chronyd)
        Tasks: 1
       CGroup: /system.slice/chronyd.service
               └─46182 /usr/sbin/chronyd -4
    
    Nov 27 09:18:25 ODA01 systemd[1]: Starting NTP client/server...
    Nov 27 09:18:25 ODA01 chronyd[46182]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
    Nov 27 09:18:25 ODA01 chronyd[46182]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
    Nov 27 09:18:25 ODA01 systemd[1]: Started NTP client/server.
    

    Finally, you can use the following command to check NTP synchronization with chronyd:

    [root@ODA01 ~]# chronyc tracking
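
    You can also list the configured time sources and their reachability:

    [root@ODA01 ~]# chronyc sources -v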
    

    The article NTP is not working for ODA new deployment (reimage) in version 19.8? appeared first on the dbi services Blog.


    Oracle SPD status on two learning paths


    By Franck Pachot

    .
    I have written a lot about SQL Plan Directives, which appeared in 12c. They were used by default and, because of some side effects at the time of 12cR1 with legacy applications that were parsing too much, they have been disabled by default in 12cR2. Today, they are probably not used enough because of their bad reputation from those times. But for data warehouses, they should be the default in my opinion.

    There is a behaviour that surprised me initially and I thought it was a bug but, after 5 years, the verdict is: expected behaviour (Bug 20311655 : SQL PLAN DIRECTIVE INVALIDATED BY STATISTICS FEEDBACK). The name of the bug is my fault: I initially thought that the statistics feedback had been wrongly interpreted as HAS_STATS. But actually, this behaviour has nothing to do with it: it was visible here only because the re-optimization had triggered a new hard parse, which changed the state. But any other query on similar predicates would have done the same.

    And this is what I’m showing here: when the misestimate cannot be solved by extended statistics, the learning path of the SQL Plan Directive has to go through this HAS_STATS state where the misestimate will occur again. I’m mentioning that extended statistics can help or not because this is anticipated by the optimizer. For this reason, I’ve run two sets of examples: one with a predicate where no column group can help, and one where extended statistics can be created.

    SQL> show parameter optimizer_adaptive
    NAME                              TYPE    VALUE 
    --------------------------------- ------- ----- 
    optimizer_adaptive_plans          boolean TRUE 
    optimizer_adaptive_reporting_only boolean FALSE 
    optimizer_adaptive_statistics     boolean TRUE 

    Since 12.2 the adaptive statistics are disabled by default: SQL Plan Directives are created but not used. This is fine for OLTP databases that are upgraded from previous versions. However, for data warehouses, analytics, ad-hoc queries and reporting, enabling adaptive statistics may help a lot when the static statistics are not sufficient to optimize complex queries.

    SQL> alter session set optimizer_adaptive_statistics=true;
    
    Session altered.

    I’m enabling adaptive statistics for my session.

    SQL> exec for r in (select directive_id from dba_sql_plan_dir_objects where owner=user) loop begin dbms_spd.drop_sql_plan_directive(r.directive_id); exception when others then raise; end; end loop;

    I’m removing all SQL Plan Directives in my lab to build a reproducible test case.

    SQL> create table DEMO pctfree 99 as select mod(rownum,2) a,mod(rownum,2) b,mod(rownum,2) c,mod(rownum,2) d from dual connect by level <=1000;
    
    Table DEMO created.

    This is my test table, built on purpose with a special distribution of data: each row has either 0 or 1 in all columns, so the four columns are perfectly correlated.

    SQL> alter session set statistics_level=all;
    
    Session altered.

    I’m collecting execution statistics down to the execution plan operation level, in order to see all execution statistics.

    SPD learning path {E}:
    USABLE(NEW)->SUPERSEDED(HAS_STATS)->USABLE(PERMANENT)

    SQL> select count(*) c1 from demo where a+b+c+d=0;
    
        C1 
    ______ 
       500 

    Here is a query where dynamic sampling can help to get better statistics on selectivity, but where no static statistics can help, not even a column group (statistics extensions on expressions are not considered for SQL Plan Directives, even in 21c).

    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
                                                                                    PLAN_TABLE_OUTPUT 
    _________________________________________________________________________________________________ 
    SQL_ID  fjcbm5x4014mg, child number 0                                                             
    -------------------------------------                                                             
    select count(*) c1 from demo where a+b+c+d=0                                                      
                                                                                                      
    Plan hash value: 2180342005                                                                       
                                                                                                      
    ----------------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |    
    ----------------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.03 |     253 |    250 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.03 |     253 |    250 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |     10 |    500 |00:00:00.03 |     253 |    250 |    
    ----------------------------------------------------------------------------------------------    
                                                                                                      
    Predicate Information (identified by operation id):                                               
    ---------------------------------------------------                                               
                                                                                                      
       2 - filter("A"+"B"+"C"+"D"=0)      

    As expected, the estimation (10 rows) is far from the actual number of rows (500). This statement is flagged for re-optimisation with cardinality feedback, but I’m interested in different SQL statements here.

    SQL> exec dbms_spd.flush_sql_plan_directive;
    
    PL/SQL procedure successfully completed.
    
    SQL> select state, extract(notes,'/spd_note/internal_state/text()') internal_state, extract(notes,'/spd_note/spd_text/text()') spd_text from dba_sql_plan_directives where directive_id in (select directive_id from dba_sql_plan_dir_objects where owner=user) and type='DYNAMIC_SAMPLING';
    
    
        STATE    INTERNAL_STATE                      SPD_TEXT 
    _________ _________________ _____________________________ 
    USABLE    NEW               {E(DEMO.DEMO)[A, B, C, D]}    

    A SQL Plan Directive has been created to keep the information that equality predicates on columns A, B, C and D are misestimated. The directive is in internal state NEW. The visible state is USABLE which means that dynamic sampling will be used by queries with a similar predicate on those columns.

    SQL> select count(*) c2 from demo where a+b+c+d=0;
    
        C2 
    ______ 
       500 
    
    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
                                                                           PLAN_TABLE_OUTPUT 
    ________________________________________________________________________________________ 
    SQL_ID  5sg7b9jg6rj2k, child number 0                                                    
    -------------------------------------                                                    
    select count(*) c2 from demo where a+b+c+d=0                                             
                                                                                             
    Plan hash value: 2180342005                                                              
                                                                                             
    -------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
    -------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     253 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     253 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |    500 |    500 |00:00:00.01 |     253 |    
    -------------------------------------------------------------------------------------    
                                                                                             
    Predicate Information (identified by operation id):                                      
    ---------------------------------------------------                                      
                                                                                             
       2 - filter("A"+"B"+"C"+"D"=0)                                                         
                                                                                             
    Note                                                                                     
    -----                                                                                    
       - dynamic statistics used: dynamic sampling (level=AUTO)                              
       - 1 Sql Plan Directive used for this statement      

    As expected, a different query (note that I changed the column alias C1 to C2, but anything can be different as long as there’s an equality predicate involving the same columns) has accurate estimations (E-Rows=A-Rows) because of dynamic sampling (dynamic statistics), thanks to the SQL Plan Directive being used.

    SQL> exec dbms_spd.flush_sql_plan_directive;
    
    PL/SQL procedure successfully completed.
    
    SQL> select state, extract(notes,'/spd_note/internal_state/text()') internal_state, extract(notes,'/spd_note/spd_text/text()') spd_text from dba_sql_plan_directives where directive_id in (select directive_id from dba_sql_plan_dir_objects where owner=user) and type='DYNAMIC_SAMPLING';
    
            STATE    INTERNAL_STATE                      SPD_TEXT 
    _____________ _________________ _____________________________ 
    SUPERSEDED    HAS_STATS         {E(DEMO.DEMO)[A, B, C, D]}    
    

    This is the important part, and initially I thought it was a bug, because SUPERSEDED means that the next query on similar columns will not do dynamic sampling anymore, and then will have bad estimations. HAS_STATS does not mean that we have correct estimations here, but only that there are no additional static statistics that can help: the optimizer has detected an expression (“A”+”B”+”C”+”D”=0) and automatic statistics extensions do not consider expressions.

    SQL> select count(*) c3 from demo where a+b+c+d=0;
    
        C3 
    ______ 
       500 
    
    
    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
                                                                           PLAN_TABLE_OUTPUT 
    ________________________________________________________________________________________ 
    SQL_ID  62cf5zwt4rwgj, child number 0                                                    
    -------------------------------------                                                    
    select count(*) c3 from demo where a+b+c+d=0                                             
                                                                                             
    Plan hash value: 2180342005                                                              
                                                                                             
    -------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
    -------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     253 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     253 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |     10 |    500 |00:00:00.01 |     253 |    
    -------------------------------------------------------------------------------------    
                                                                                             
    Predicate Information (identified by operation id):                                      
    ---------------------------------------------------                                      
                                                                                             
       2 - filter("A"+"B"+"C"+"D"=0)    

    We are still in the learning phase and, as you can see, even if we know that there is a misestimate (the SPD has been created), adaptive statistics tries to avoid dynamic sampling: no SPD usage is mentioned in the notes, and we are back to the misestimate of E-Rows=10.

    SQL> exec dbms_spd.flush_sql_plan_directive;
    
    PL/SQL procedure successfully completed.
    
    SQL> select state, extract(notes,'/spd_note/internal_state/text()') internal_state, extract(notes,'/spd_note/spd_text/text()') spd_text from dba_sql_plan_directives where directive_id in (select directive_id from dba_sql_plan_dir_objects where owner=user) and type='DYNAMIC_SAMPLING';
    
        STATE    INTERNAL_STATE                      SPD_TEXT 
    _________ _________________ _____________________________ 
    USABLE    PERMANENT         {E(DEMO.DEMO)[A, B, C, D]}    

    The HAS_STATS state and the misestimate were temporary. Now that the optimizer has validated that, with all possible static statistics available (HAS_STATS), we still have a misestimate, it has moved the SPD status to PERMANENT: end of the learning phase, and we will permanently do dynamic sampling for this kind of query.

    SQL> select count(*) c4 from demo where a+b+c+d=0;
    
        C4 
    ______ 
       500 
    
    
    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
                                                                           PLAN_TABLE_OUTPUT 
    ________________________________________________________________________________________ 
    SQL_ID  65ufgd70n61nh, child number 0                                                    
    -------------------------------------                                                    
    select count(*) c4 from demo where a+b+c+d=0                                             
                                                                                             
    Plan hash value: 2180342005                                                              
                                                                                             
    -------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
    -------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     253 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     253 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |    500 |    500 |00:00:00.01 |     253 |    
    -------------------------------------------------------------------------------------    
                                                                                             
    Predicate Information (identified by operation id):                                      
    ---------------------------------------------------                                      
                                                                                             
       2 - filter("A"+"B"+"C"+"D"=0)                                                         
                                                                                             
    Note                                                                                     
    -----                                                                                    
       - dynamic statistics used: dynamic sampling (level=AUTO)                              
       - 1 Sql Plan Directive used for this statement                                        
                                                                

    Yes, it has an overhead at hard parse time, but that helps to get better estimations and then faster execution plans. The execution plan note shows that dynamic sampling is done because of SPD usage.

    SPD learning path {EC}:
    USABLE(NEW)->USABLE(MISSING_STATS)->SUPERSEDED(HAS_STATS)

    I’m now running a query where the misestimate can be avoided with additional statistics: column group statistics extension.

    SQL> select count(*) c1 from demo where a=0 and b=0 and c=0 and d=0;
    
        C1 
    ______ 
       500 
    
    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
                                                                           PLAN_TABLE_OUTPUT 
    ________________________________________________________________________________________ 
    SQL_ID  2x5j71630ua0z, child number 0                                                    
    -------------------------------------                                                    
    select count(*) c1 from demo where a=0 and b=0 and c=0 and d=0                           
                                                                                             
    Plan hash value: 2180342005                                                              
                                                                                             
    -------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
    -------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     253 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     253 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |     63 |    500 |00:00:00.01 |     253 |    
    -------------------------------------------------------------------------------------    
                                                                                             
    Predicate Information (identified by operation id):                                      
    ---------------------------------------------------                                      
                                                                                             
       2 - filter(("A"=0 AND "B"=0 AND "C"=0 AND "D"=0))   

    I have a misestimate here (E-Rows much lower than A-Rows) because the optimizer doesn’t know the correlation between A, B, C and D.

    SQL> exec dbms_spd.flush_sql_plan_directive;
    
    PL/SQL procedure successfully completed.
    
    SQL> select state, extract(notes,'/spd_note/internal_state/text()') internal_state, extract(notes,'/spd_note/spd_text/text()') spd_text from dba_sql_plan_directives where directive_id in (select directive_id from dba_sql_plan_dir_objects where owner=user) and type='DYNAMIC_SAMPLING';
    
    
        STATE    INTERNAL_STATE                       SPD_TEXT 
    _________ _________________ ______________________________ 
    USABLE    PERMANENT         {E(DEMO.DEMO)[A, B, C, D]}     
    USABLE    NEW               {EC(DEMO.DEMO)[A, B, C, D]}    

    I now have a new SQL Plan Directive, and the difference with the previous one is that the equality predicate is a simple column equality on each column (EC instead of E). From that, the optimizer knows that a statistics extension on the column group may help.

    SQL> select count(*) c2 from demo where a=0 and b=0 and c=0 and d=0;
    
        C2 
    ______ 
       500 
    
    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
    
                                                                           PLAN_TABLE_OUTPUT 
    ________________________________________________________________________________________ 
    SQL_ID  5sg8p03mmx7ca, child number 0                                                    
    -------------------------------------                                                    
    select count(*) c2 from demo where a=0 and b=0 and c=0 and d=0                           
                                                                                             
    Plan hash value: 2180342005                                                              
                                                                                             
    -------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
    -------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     253 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     253 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |    500 |    500 |00:00:00.01 |     253 |    
    -------------------------------------------------------------------------------------    
                                                                                             
    Predicate Information (identified by operation id):                                      
    ---------------------------------------------------                                      
                                                                                             
       2 - filter(("A"=0 AND "B"=0 AND "C"=0 AND "D"=0))                                     
                                                                                             
    Note                                                                                     
    -----                                                                                    
       - dynamic statistics used: dynamic sampling (level=AUTO)                              
       - 1 Sql Plan Directive used for this statement       

    So, the NEW directive is in USABLE state: the SPD is used to do some dynamic sampling, as it was in the previous example.

    SQL> exec dbms_spd.flush_sql_plan_directive;
    
    PL/SQL procedure successfully completed.
    
    SQL> select state, extract(notes,'/spd_note/internal_state/text()') internal_state, extract(notes,'/spd_note/spd_text/text()') spd_text from dba_sql_plan_directives where directive_id in (select directive_id from dba_sql_plan_dir_objects where owner=user) and type='DYNAMIC_SAMPLING';
    
        STATE    INTERNAL_STATE                       SPD_TEXT 
    _________ _________________ ______________________________ 
    USABLE    PERMANENT         {E(DEMO.DEMO)[A, B, C, D]}     
    USABLE    MISSING_STATS     {EC(DEMO.DEMO)[A, B, C, D]}    

    Here we have an additional state during the learning phase because there’s something else that can be done: we are not in HAS_STATS because more statistics can be gathered. We are in the MISSING_STATS internal state. This is a USABLE state, so dynamic sampling continues until we gather statistics.

    SQL> select count(*) c3 from demo where a=0 and b=0 and c=0 and d=0;
    
        C3 
    ______ 
       500 
    
    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
                                                                           PLAN_TABLE_OUTPUT 
    ________________________________________________________________________________________ 
    SQL_ID  d8zyzh140xk0d, child number 0                                                    
    -------------------------------------                                                    
    select count(*) c3 from demo where a=0 and b=0 and c=0 and d=0                           
                                                                                             
    Plan hash value: 2180342005                                                              
                                                                                             
    -------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
    -------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     253 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     253 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |    500 |    500 |00:00:00.01 |     253 |    
    -------------------------------------------------------------------------------------    
                                                                                             
    Predicate Information (identified by operation id):                                      
    ---------------------------------------------------                                      
                                                                                             
       2 - filter(("A"=0 AND "B"=0 AND "C"=0 AND "D"=0))                                     
                                                                                             
    Note                                                                                     
    -----                                                                                    
       - dynamic statistics used: dynamic sampling (level=AUTO)                              
       - 1 Sql Plan Directive used for this statement       

    That can continue for a long time, with the SPD in USABLE state and dynamic sampling compensating for the missing statistics, but at the cost of additional work during hard parse time.

    SQL> exec dbms_spd.flush_sql_plan_directive;
    
    PL/SQL procedure successfully completed.
    
    SQL> select created,state, extract(notes,'/spd_note/internal_state/text()') internal_state, extract(notes,'/spd_note/spd_text/text()') spd_text from dba_sql_plan_directives where directive_id in (select directive_id from dba_sql_plan_dir_objects where owner=user) and type='DYNAMIC_SAMPLING' order by last_used;
    
        CREATED     STATE    INTERNAL_STATE                       SPD_TEXT 
    ___________ _________ _________________ ______________________________ 
    20:52:11    USABLE    PERMANENT         {E(DEMO.DEMO)[A, B, C, D]}     
    20:52:16    USABLE    MISSING_STATS     {EC(DEMO.DEMO)[A, B, C, D]}    

    The status will not change until statistics gathering occurs.

    SQL> exec dbms_stats.set_table_prefs(user,'DEMO','AUTO_STAT_EXTENSIONS','on');
    
    PL/SQL procedure successfully completed.

    Just as adaptive statistics are not enabled by default, the automatic creation of statistics extensions is not enabled by default either. I enable it for this table only here but, as with many dbms_stats operations, you can do that at schema, database or global level. Usually, you do it initially when creating the table, or simply at database level because it works in tandem with adaptive statistics, but in this demo I waited in order to show that even if the decision of going to HAS_STATS or MISSING_STATS depends on the possibility of extended statistics creation, this decision is made without looking at the dbms_stats preference.
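
    For reference, here is what the database-level setting would look like (a sketch of the alternative I mention, not what I run in this demo):

    SQL> exec dbms_stats.set_global_prefs('AUTO_STAT_EXTENSIONS','ON');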

    SQL> exec dbms_stats.gather_table_stats(user,'DEMO', options=>'gather auto');
    
    PL/SQL procedure successfully completed.

    Note that I’m gathering the statistics like the automatic job does: GATHER AUTO. As I did not change any rows, the table statistics are not stale, but the new directive in MISSING_STATS tells DBMS_STATS that there’s a reason to re-gather the statistics.

    And if you look at the statistics extensions there, there’s a new one on the (A,B,C,D) column group. Just look at USER_STAT_EXTENSIONS.
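
    For example, with a query like this (illustrative; the extension name is system-generated):

    SQL> select extension_name, extension from user_stat_extensions where table_name='DEMO';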

    SQL> select count(*) c4 from demo where a=0 and b=0 and c=0 and d=0;
    
        C4 
    ______ 
       500 
    
    SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
    
                                                                           PLAN_TABLE_OUTPUT 
    ________________________________________________________________________________________ 
    SQL_ID  g08m3qrmw7mgn, child number 0                                                    
    -------------------------------------                                                    
    select count(*) c4 from demo where a=0 and b=0 and c=0 and d=0                           
                                                                                             
    Plan hash value: 2180342005                                                              
                                                                                             
    -------------------------------------------------------------------------------------    
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |    
    -------------------------------------------------------------------------------------    
    |   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |     253 |    
    |   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |     253 |    
    |*  2 |   TABLE ACCESS FULL| DEMO |      1 |    500 |    500 |00:00:00.01 |     253 |    
    -------------------------------------------------------------------------------------    
                                                                                             
    Predicate Information (identified by operation id):                                      
    ---------------------------------------------------                                      
                                                                                             
       2 - filter(("A"=0 AND "B"=0 AND "C"=0 AND "D"=0))                                     
                                                                                             
    Note                                                                                     
    -----                                                                                    
       - dynamic statistics used: dynamic sampling (level=AUTO)                              
       - 1 Sql Plan Directive used for this statement     

    You may think that no dynamic sampling is needed anymore but the Adaptive Statistics mechanism is still in the learning phase: the SPD is still USABLE and the next parse will verify if MISSING_STATS can be superseded by HAS_STATS. This is what happened here.

    SQL> exec dbms_spd.flush_sql_plan_directive;
    
    PL/SQL procedure successfully completed.
    
    SQL> select created,state, extract(notes,'/spd_note/internal_state/text()') internal_state, extract(notes,'/spd_note/spd_text/text()') spd_text from dba_sql_plan_directives where directive_id in (select directive_id from dba_sql_plan_dir_objects where owner=user) and type='DYNAMIC_SAMPLING' order by last_used;
    
        CREATED         STATE    INTERNAL_STATE                       SPD_TEXT 
    ___________ _____________ _________________ ______________________________ 
    20:52:11    USABLE        PERMANENT         {E(DEMO.DEMO)[A, B, C, D]}     
    20:52:16    SUPERSEDED    HAS_STATS         {EC(DEMO.DEMO)[A, B, C, D]}    

    Here, SUPERSEDED means no more dynamic sampling for predicates with simple column equality on A, B, C, D, because the directive is in HAS_STATS: the column group statistics now cover it.

    In the past, I mean before 12c, I often recommended enabling dynamic sampling (with optimizer_dynamic_sampling >= 4) on data warehouses, or for sessions running complex ad-hoc queries for reporting. And no dynamic sampling, creating manual statistics extensions only when required, for OLTP where we can expect less complex queries and where hard parse time may be a problem.

    Now, in the same idea, I’d rather recommend enabling adaptive statistics because it is a finer-grained optimization. As we see here, only one kind of predicate does dynamic sampling, and this dynamic sampling is the “adaptive” one, estimating not only single table cardinality but joins and aggregations as well. This is the USABLE (PERMANENT) one. The other did it only temporarily, until the statistics extension was automatically created and the directive was SUPERSEDED with HAS_STATS.

    In summary, the MISSING_STATS state is seen only when, given the simple column equality, there are possible statistics that are missing. And HAS_STATS means that all the statistics that can be used by the optimizer for this predicate are available and no more can be gathered. Each directive will go through HAS_STATS during the learning phase. And then it stays in HAS_STATS, or switches definitively to the PERMANENT state when HAS_STATS encounters a misestimate again.

    The article Oracle SPD status on two learning paths appeared first on the dbi services Blog.

    DynamoDB Scan: the most efficient operation 😉


    By Franck Pachot

    .
    The title is provocative on purpose because you can read in many places that you should avoid scans, and that Scan operations are less efficient than other operations in DynamoDB. I think that there is a risk, reading those messages without understanding what is behind them, that people will actually avoid Scans and replace them with something that is even worse. If you want to compare the efficiency of an operation, you must compare it when doing the same thing, or it is an apples vs. oranges comparison. Here I’ll compare it for two extreme use cases: the need to get all items, and the need to get one item only. And then I’ll explain further what is behind the “avoid scans” idea.

    I have created a table with 5000 items:

    
    aws dynamodb create-table --table-name Demo \
     --attribute-definitions AttributeName=K,AttributeType=N \
     --key-schema            AttributeName=K,KeyType=HASH \
     --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=25,WriteCapacityUnits=25
    
    for i in {1..5000} ; do
    aws dynamodb put-item     --table-name Demo --item '{"K":{"N":"'${i}'"},"V":{"S":"'"$RANDOM"'"}}'
    done
    

    Because each time I demo on a small table I get comments like “this proves nothing, the table is too small”, I have to point out that you don’t need petabytes to understand how it scales. Especially with DynamoDB, which is designed to scale linearly: there is no magic that happens after reaching a threshold, like you can have in an RDBMS (small scans optimized with cache, large scans optimized with storage indexes / zone maps). If you have doubts, you can run the same and change 5000 to 5000000000 and you will observe the same, but you do that on your own cloud bill, not mine 😉

    Let’s count the items:

    
    [opc@a DynamoDBLocal]$ aws dynamodb scan --table-name Demo --select=COUNT --return-consumed-capacity TOTAL --output text
    
    5000    None    5000
    CONSUMEDCAPACITY        6.0     Demo
    

    This is a Scan operation. The consumed capacity is 6 RCU. Is this good or bad? Efficient or not?
First, let’s understand those 6 RCU. I have 5000 items, each a bit less than 10 bytes (2 attributes with one-character names, a number of up to 5 digits). This is about 48 kilobytes, read with eventual consistency (we don’t read all mirrors), where reading 4 kilobytes costs 0.5 RCU. The maths is easy: 48 / 4 / 2 = 6. If you test it on 5000 million items, as I suggested for those who don’t believe in small test cases, you will see 6 million RCU. It is just elementary arithmetic, cross-multiply and you get it, there’s no magic. So, if you provisioned the maximum on-demand RCU, which I think is 40000 RCU per second by default, you can count those 5000 million items in two and a half minutes. Is that inefficient? Try parallel scans…
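To give an idea of what that looks like, here is a minimal sketch of a parallel Scan using the standard --total-segments / --segment parameters (4 segments here, the number is arbitrary; each segment runs as a background job and returns its own count and consumed capacity):

for s in 0 1 2 3 ; do
 aws dynamodb scan --table-name Demo --select=COUNT \
  --total-segments 4 --segment $s \
  --return-consumed-capacity TOTAL --output text &
done
wait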

    Scan

You see where I’m coming from. There’s no operation to “avoid” or ban. It just depends on what you want to do. Counting all items is done with a Scan and you cannot do it faster in DynamoDB. Except if you maintain a global counter, but then you will double the cost of each putItem. You don’t make it faster, you just transfer the cost to another part of the application.
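For illustration only, maintaining such a counter would mean one extra atomic ADD on a dedicated item for each insert (the item key 0 and the attribute name itemCount are made up for this sketch):

aws dynamodb update-item --table-name Demo \
 --key '{"K":{"N":"0"}}' \
 --update-expression "ADD itemCount :one" \
 --expression-attribute-values '{":one":{"N":"1"}}' \
 --return-consumed-capacity TOTAL

That is one more write for every putItem, which is exactly the cost transfer mentioned above.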

    You may want to do something more complex than a count. This is a scan that sums the values of the attribute “V”:

    
    [opc@a DynamoDBLocal]$ aws dynamodb scan --table-name Demo --select=SPECIFIC_ATTRIBUTES --projection-expression=V --return-consumed-capacity TOTAL --output text \
    | awk '/^CONSUMEDCAPACITY/{rcu=rcu+$2}/^V/{sum=sum+$2;cnt=cnt+1}END{printf "%10.2f rcu   %10d items %10d sum(V)\n",rcu,cnt,sum}'
    
          6.00 rcu         5000 items   81599797 sum(V)
    

    The code handles pagination (not needed here as my table is less than 1MB, but for people trying on 5000 million items, they can copy/paste this). I’ve described scan pagination in a previous post so you understand why I use the “text” output here. No surprise, a Scan is a Scan and there’s no cache in DynamoDB to make it faster when you read the same data frequently: 6 RCU again.

    GetItem

Then, what will happen if you tell your developers that they must avoid scans? The table design is already there, and they need to get the count and the sum. This is not a critical use case, maybe just something to display in a daily dashboard, so there’s no point in adding the overhead of maintaining counters with a Lambda or AWS Glue Elastic Views. A Scan is perfectly valid here. But they try to avoid this “inefficient scan” and then come up with this idea: they know the last item number inserted (5000 in my demo) and use the “efficient” getItem call:

    
    [opc@a DynamoDBLocal]$ for i in {1..5000} ; do  aws dynamodb get-item --table-name Demo --key '{"K":{"N":"'$i'"}}' --return-consumed-capacity TOTAL ; done \
    | awk '/^CONSUMEDCAPACITY/{rcu=rcu+$2}/^V/{sum=sum+$2;cnt=cnt+1}END{printf "%10.2f rcu   %10d items %10d sum(V)\n",rcu,cnt,sum}'
    
       2500.00 rcu         5000 items   81599797 sum(V)
    
    

No surprises if you know how it works: each getItem costs 0.5 RCU and then the total is 2500 RCU. Most of the time, you read the same block of data from the storage, but this still counts as RCU. This is 416 times more expensive than the scan. So, let’s refine the “scan is the least efficient operation” claim by:

• Scan is the least efficient operation to get one item
    • Scan is the most efficient operation to get many items

    Size

What does “many” mean here? As I did here, getting all items is where a Scan is the most efficient. But given what we know from my example, as getItem costs 0.5 RCU per item and a Scan costs 6 RCU, we can say that Scan is the most efficient operation when getting more than 12 items. However, this depends on two things. First, depending on which predicate filters those 12 items, a Query may be faster than a Scan. This depends on the data model and it is not the case with my table here. Second, this factor of 12 depends on the size of the items (a quick back-of-the-envelope check follows the list below). Because:

    • The Scan operation depends on the size of the table (all items with all attributes) and not on the number of items read
• The GetItem operation depends on the number of items read (and their size when larger than 4KB)
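As a quick back-of-the-envelope check with the numbers of this demo (6 RCU for the full Scan, 0.5 RCU per GetItem):

awk 'BEGIN{ scan=6 ; getitem=0.5 ; printf "breakeven: %d items\n", scan/getitem }'

which prints “breakeven: 12 items”, the factor mentioned above.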

In my example, I have small items (10 bytes) and then a Scan can get more than 400 items per 0.5 RCU, where GetItem can get at most 1 item per RCU. With this, the Scan is quickly more efficient than GetItem. And this does not depend on the size of the table, but on the size of each item. This is important because the best practice documentation also says “you should avoid using a Scan operation on a large table or index with a filter that removes many results”. If we take the “avoid” as absolute, this is true, but it can also apply to any operation: avoid reading your data and everything will be faster and cheaper 😉 If we take “avoid” as using another access type, like GetItem, then this is wrong: the table size does not count in the efficiency. This claim is right only when this “filter that removes many results” is an equality predicate on the partition key. But at the time the developer reads this, the table design is done and it is too late. In NoSQL, you don’t have the agility to change the partitioning key without huge refactoring of the code, because you don’t have the RDBMS logical data independence. The best you can do for this use case is a Scan and, maybe, cache it with your application code, or with the DAX service, if it occurs too frequently.

All this is not new for SQL people. This myth of “full table scans are evil” is very old. Then, people realized that a full table scan may be the most efficient access, especially with all the optimizations that happened in the last decades (hash joins, prefetching, direct-path reads, storage indexes, adaptive plans, …). Please never say that something is inefficient without the context, or you will miss the best of it. When a “best practice” is spread without the context, it becomes a myth. DynamoDB has the advantage of being simple (limited access paths, no cost-based optimizer, …), and then it is easy to understand the cost of an access path rather than apply some best practices blindly.

How do you measure efficiency? When you look at the number of items you can get with one RCU, a Scan is actually the most efficient. And, please, don’t think that we should “avoid” scans as if another operation could be more efficient. What we should avoid with DynamoDB is a data model that requires scans for critical operations. Remember that it is a key-value datastore: optimized to get one item with GetItem (or one collection with Query) for one hash key value. When you need to read many items, it is still efficient with an appropriate composite key defined for that, like in the Single Table Design where one Query can retrieve with one RCU all items to be joined, or with a Global Secondary Index, as it is a replica with a different partitioning scheme. But as soon as you read from all partitions, a Scan is the most efficient operation.
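To make that last point concrete, a Query on a well-designed composite key looks like this (a sketch only: the Orders table, the PK attribute and the CUSTOMER#42 value are made up and are not part of the Demo table used above):

aws dynamodb query --table-name Orders \
 --key-condition-expression "PK = :pk" \
 --expression-attribute-values '{":pk":{"S":"CUSTOMER#42"}}' \
 --return-consumed-capacity TOTAL

Here the consumed RCU depends only on the size of the items sharing that partition key, not on the size of the table, which is why this access path scales where a filtered Scan does not.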

The article DynamoDB Scan: the most efficient operation 😉 appeared first on the dbi services blog.

    SQL Server TCP: Having both Dynamic Ports and Static Port configured


    Introduction

    Have you ever seen an SQL Server instance configured to listen on both “TCP Dynamic Ports” and “TCP (static) Port”?

    This kind of configuration can be caused by the following scenario:

    1. A named instance is installed. By default, it is configured to use dynamic ports.
2. Someone wants to configure the instance to listen on a fixed port and sets the “TCP Port” value
3. The “TCP Dynamic Ports” value is set to “0”, thinking this would disable dynamic ports

The documentation states that a value of “0” actually enables “TCP Dynamic Ports”:

    If the TCP Dynamic Ports dialog box contains 0, indicating the Database Engine is listening on dynamic ports

After a service restart, SQL Server will listen on a port like 50119, for example.
    You end up with the following configuration.
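If you prefer checking this from T-SQL instead of SQL Server Configuration Manager, the registry values behind that dialog can also be read through sys.dm_server_registry (a quick sketch, using the instance name of this example):

sqlcmd -S .\inst1 -E -Q "select value_name, value_data from sys.dm_server_registry where value_name in ('TcpPort','TcpDynamicPorts')"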

    So what’s happening to SQL Server with this configuration?

What is SQL Server listening on?

    Well, I could not find anything related to this particular case in the Microsoft documentation.
    If we look at the SQL Server Error Log we can see that the instance is listening on both ports: the dynamically chosen one and the static port.

    We can confirm this by trying a connection using SSMS:
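The same check can be scripted with sqlcmd by forcing each TCP port explicitly (the host name is a placeholder; the ports are the ones of this example):

sqlcmd -S tcp:myserver,50119 -E -Q "select local_tcp_port from sys.dm_exec_connections where session_id = @@spid"
sqlcmd -S tcp:myserver,15001 -E -Q "select local_tcp_port from sys.dm_exec_connections where session_id = @@spid"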

What TCP port is used by client connections?

    But, are both ports actually used by client connections to the server?
    From SQL we can see the established connections and their TCP port using this query:

    select distinct local_tcp_port
    from sys.dm_exec_connections
    where net_transport = 'TCP'

    This could also be seen with netstat:
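For example, something like this (filtering on the dynamic and static ports of this example):

netstat -ano | findstr "50119 15001"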

    Looking at this information I see no connection at all using the dynamically assigned port.
    Only the static port is used.

    SQL Browser

My guess is that the SQL Server Browser gives priority to the static port and always returns this port to clients. I didn’t find any information online about this behavior, but it makes sense.

When a client wants to connect to an instance using “server\instancename”, an exchange is done with the server using the SQL Server Resolution Protocol over UDP. This is why you should open UDP port 1434 in your firewall if you need the SQL Browser.
    For details about this protocol, you can read the specifications here.

    Doing some tests with Wireshark and a UDP filter we can see the client asking about “inst1”, my instance name.

    The server response contains some information about the instance with the most important one, the TCP Port, here the static port: 15001.

    Conclusion

I think this configuration should be avoided because it doesn’t seem to add any benefit and could bring some confusion.
If you use a static TCP port for your instance, leave the “TCP Dynamic Ports” value blank.

     

The article SQL Server TCP: Having both Dynamic Ports and Static Port configured appeared first on the dbi services blog.

    Running two Patroni on one host using an existing etcd


Have you ever asked yourself how to create a second Patroni PostgreSQL cluster on an existing server using the existing etcd? My first idea was to study the Patroni documentation, but unfortunately without much success. This post should help to identify the changes you have to make on the hosts to run two parallel Patroni clusters using an existing etcd.

    Starting Point

First we want to have a short look at the existing infrastructure to get a better overview of where we are starting from.

A Patroni installation and etcd already exist on the servers, as well as one PostgreSQL cluster streaming from primary to replica. We are not using two replicas in this example, but it works the same with any number of replicas.

    etcd

As etcd is already running on the hosts, let’s start with this one. And here we already have good news: you don’t have to change anything on the etcd side. Just leave your configuration as it is.

    postgres@postgres_primary:/home/postgres/.local/bin/ [PGTEST] cat /u01/app/postgres/local/dmk/dmk_postgres/etc/etcd.conf
    name: postgres-primary
    data-dir: /u02/postgres/pgdata/etcd
    initial-advertise-peer-urls: http://192.168.22.33:2380
    listen-peer-urls: http://192.168.22.33:2380
    listen-client-urls: http://192.168.22.33:2379,http://localhost:2379
    advertise-client-urls: http://192.168.22.33:2379
    initial-cluster: postgres-primary=http://192.168.22.33:2380, postgres-stby=http://192.168.22.34:2380, postgres-stby2=http://192.168.22.35:2380
    

    patroni.yml

Let’s go on with the patroni.yml. As there is already a Patroni running on that server, you need to create another patroni.yml, let’s say patroni_pgtest.yml. To keep it simple and not reinvent the wheel, just copy your existing yml file:

    postgres@postgres_primary:/home/postgres/ [PGTEST] cd /u01/app/postgres/local/dmk/dmk_postgres/etc
    postgres@postgres_primary:/u01/app/postgres/local/dmk/dmk_postgres/etc/ [PGTEST] cp patroni.yml patroni_pgtest.yml
    

Once we have the new patroni_pgtest.yml, we need to adjust some entries in this file. The most important entries to change are “namespace” and “scope”. Without changing these, your new Patroni service won’t create a new PostgreSQL cluster:

    scope: PGTEST
    namespace: /pgtest/
    name: pgtest1
    

The next parameters to change are the restapi ones. You can keep the IP address, but you have to adjust the port. Otherwise the service will fail with an “Address already in use” error.

    restapi:
      listen: 192.168.22.33:8009
      connect_address: 192.168.22.33:8009
    

Once this is done, the PostgreSQL parameters need, of course, to be adjusted so they do not use the same port and cluster name as the already existing cluster. Furthermore, the PGDATA directory needs to be adjusted.

    ...
    ...
      dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        postgresql:
          use_pg_rewind: true
          use_slots: true
          parameters:
            ...
            ...
            port: 5410
    ...
    ...
    postgresql:
      listen: 192.168.22.33:5410
      connect_address: 192.168.22.33:5410
      data_dir: /u02/postgres/pgdata/13/PGTEST/
    ...
    ...
    
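Before creating the service, it can be worth a quick check that the new ports (5410 for PostgreSQL and 8009 for the REST API in this example) are not already in use on the host:

ss -tlnp | grep -E ':5410|:8009' || echo "ports 5410 and 8009 are free"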

    Patroni service

Now that we have changed all our parameters, we can create a second Patroni service named patroni_pgtest.service. Be sure to point to the correct patroni_pgtest.yml:

    postgres@postgres_primary:/home/postgres/ [PGTEST] sudo vi /etc/systemd/system/patroni_pgtest.service
    #
    # systemd integration for patroni
    #
    
    [Unit]
    Description=dbi services patroni service
    After=etcd.service syslog.target network.target
    
    [Service]
    User=postgres
    Type=simple
    ExecStart=/u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni_pgtest.yml
    ExecReload=/bin/kill -s HUP $MAINPID
    KillMode=process
    Restart=no
    
    [Install]
    WantedBy=multi-user.target
    

    Now we can start and enable the service

    postgres@postgres_primary:/home/postgres/ [PGTEST] sudo systemctl start patroni_pgtest.service
    postgres@postgres_primary:/home/postgres/ [PGTEST] sudo systemctl status patroni_pgtest.service
    ● patroni_pgtest.service - dbi services patroni service
       Loaded: loaded (/etc/systemd/system/patroni_pgtest.service; enabled; vendor preset: disabled)
       Active: active (running) since Tue 2020-12-22 20:07:46 CET; 9h ago
     Main PID: 4418 (patroni)
       CGroup: /system.slice/patroni.service
               ├─4418 /usr/bin/python2 /u01/app/postgres/local/dmk/dmk_postgres/bin/patroni /u01/app/postgres/local/dmk/dmk_postgres/etc/patroni_pgtest.yml
               ├─5258 /u01/app/postgres/product/PG13/db_1/bin/postgres -D /u02/pgdata/13/PG1/ --config-file=/u02/postgres/pgdata/13/PG1/postgresql.conf --listen_addresses=192.168.22.33 --max_worker_processes=8 --max_locks_per_tra...
               ├─5282 postgres: PG1: logger process
               ├─5292 postgres: PG1: checkpointer process
               ├─5294 postgres: PG1: writer process
               ├─5296 postgres: PG1: stats collector process
               ├─6171 postgres: PG1: postgres postgres 192.168.22.33(50492) idle
               ├─6473 postgres: PG1: wal writer process
               ├─6474 postgres: PG1: autovacuum launcher process
    
    Dec 23 05:36:21 postgres_primary patroni[4418]: 2020-12-23 05:36:21,032 INFO: Lock owner: postgres_primary; I am postgres_primary
    Dec 23 05:36:21 postgres_primary patroni[4418]: 2020-12-23 05:36:21,047 INFO: no action.  i am the leader with the lock
    
    postgres@postgres_primary:/home/postgres/ [PGTEST] sudo systemctl enable patroni_pgtest.service
    
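If you want to double check that the second cluster registered under its own namespace in the existing etcd, you can list the keys below /pgtest/ (a quick sketch; use the variant matching your etcd API version):

# etcd v3 API
etcdctl --endpoints=http://192.168.22.33:2379 get /pgtest/ --prefix --keys-only
# etcd v2 API
etcdctl --endpoints=http://192.168.22.33:2379 ls --recursive /pgtest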

Now that the cluster is running on the primary server, you can do exactly the same steps on your replica server(s). Be sure to set all ports and IPs correctly.

    Conclusion

Even if it seems easy to set up a second Patroni on a server, it took some time to find out what exactly needs to be changed. But once you know all that, it’s really simple. Just keep in mind that you have to use a port for your PostgreSQL cluster that is not already in use.
Furthermore, if you are using our DMK on your host, be sure to call ‘patronictl list’ with the correct configuration file and the complete path to patronictl. DMK gives you an alias for patronictl which will only work for the first Patroni cluster created on the server.

    postgres@postgres_primary:/home/postgres/ [PGTEST] cd .local/bin
    postgres@postgres_primary:/home/postgres/.local/bin [PGTEST] patronictl -c /u01/app/postgres/local/dmk/etc/patroni_pgtest.yml list
    +------------+------------------+---------------+--------+---------+-----+-----------+
    | Cluster    |     Member       |      Host     |  Role  |  State  |  TL | Lag in MB |
    +------------+------------------+---------------+--------+---------+-----+-----------+
    |   PGTEST   | postgres_primary | 192.168.22.33 | Leader | running | 528 |       0.0 |
    |   PGTEST   | postgres_replica | 192.168.22.34 |        | running | 528 |       0.0 |
    +------------+------------------+---------------+--------+---------+-----+-----------+
    

If you are not using DMK, you have to pass the configuration file in any case. You also have to set the correct PATH variable or use the complete path to call patronictl.
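A small convenience alias can help there, for example (the path to patronictl is the one of this setup and may differ on yours):

alias patronictl_pgtest='/home/postgres/.local/bin/patronictl -c /u01/app/postgres/local/dmk/etc/patroni_pgtest.yml'
patronictl_pgtest list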

The article Running two Patroni on one host using an existing etcd appeared first on the dbi services blog.

    Cluster level encryption for PostgreSQL 14


The discussions about how and why TDE (Transparent Data Encryption) should be implemented in PostgreSQL go back several years. You can have a look at these two more recent threads to get an idea of how much discussion happened around that feature:

Finally an essential part of that infrastructure was committed, and I am sure many people have waited for that to appear in plain community PostgreSQL. Let’s have a quick look at how it works and whether it is easy to play with.

To get an encrypted cluster you need to specify that when you initialize the cluster with initdb. One additional requirement is that PostgreSQL was compiled with “--with-openssl”:

    postgres@debian10pg:/home/postgres/ [pgdev] pg_config | grep openssl
    CONFIGURE =  '--prefix=/u01/app/postgres/product/DEV/db_1/' '--exec-prefix=/u01/app/postgres/product/DEV/db_1/' '--bindir=/u01/app/postgres/product/DEV/db_1//bin' '--libdir=/u01/app/postgres/product/DEV/db_1//lib' '--sysconfdir=/u01/app/postgres/product/DEV/db_1//etc' '--includedir=/u01/app/postgres/product/DEV/db_1//include' '--datarootdir=/u01/app/postgres/product/DEV/db_1//share' '--datadir=/u01/app/postgres/product/DEV/db_1//share' '--with-pgport=5432' '--with-perl' '--with-python' '--with-openssl' '--with-pam' '--with-ldap' '--with-libxml' '--with-libxslt' '--with-segsize=2' '--with-blocksize=8' '--with-llvm' 'LLVM_CONFIG=/usr/bin/llvm-config-7' '--with-systemd'
    

    If that is given you can initialize a new cluster and tell initdb how to get the encryption key:

    postgres@debian10pg:/home/postgres/ [pgdev] initdb --help | grep cluster-key-command
      -c  --cluster-key-command=COMMAND
    

If a cluster key command is provided, two internal keys are generated: one for the table and index files (and any temporary objects) and one for the WAL files:

    postgres@debian10pg:/home/postgres/ [pgdev] initdb --pgdata=/var/tmp/pgenc --cluster-key-command=/home/postgres/get_key.sh
    The files belonging to this database system will be owned by user "postgres".
    This user must also own the server process.
    
    The database cluster will be initialized with locale "en_US.UTF-8".
    The default database encoding has accordingly been set to "UTF8".
    The default text search configuration will be set to "english".
    
    Data page checksums are disabled.
    Cluster file encryption is enabled.
    
    creating directory /var/tmp/pgenc ... ok
    creating subdirectories ... ok
    selecting dynamic shared memory implementation ... posix
    selecting default max_connections ... 100
    selecting default shared_buffers ... 128MB
    selecting default time zone ... Europe/Zurich
    creating configuration files ... ok
    running bootstrap script ... ok
    performing post-bootstrap initialization ... ok
    syncing data to disk ... ok
    
    initdb: warning: enabling "trust" authentication for local connections
    You can change this by editing pg_hba.conf or using the option -A, or
    --auth-local and --auth-host, the next time you run initdb.
    
    Success. You can now start the database server using:
    
        pg_ctl -D /var/tmp/pgenc -l logfile start
    

    The command to get the key in this example is quite trivial:

    postgres@debian10pg:/home/postgres/ [pgdev] cat /home/postgres/get_key.sh
    echo "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
    

In a real setup the key should of course come from an external key store, typically through a small wrapper script around your key management tool.
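A purely hypothetical sketch of such a wrapper (the file path is made up for illustration; the key could just as well be fetched from a vault or an HSM):

#!/bin/bash
# sketch: return the 64 hexadecimal characters expected by PostgreSQL
# in a real setup, replace this with a call to your key management system
cat /etc/pgcrypto/cluster.key

Let’s try to start the cluster: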

    postgres@debian10pg:/home/postgres/ [pgdev] export PGPORT=8888
    postgres@debian10pg:/home/postgres/ [pgdev] pg_ctl -D /var/tmp/pgenc/ start
    waiting for server to start....2020-12-26 16:11:12.220 CET [7106] LOG:  starting PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
    2020-12-26 16:11:12.221 CET [7106] LOG:  listening on IPv6 address "::1", port 8888
    2020-12-26 16:11:12.221 CET [7106] LOG:  listening on IPv4 address "127.0.0.1", port 8888
    2020-12-26 16:11:12.234 CET [7106] LOG:  listening on Unix socket "/tmp/.s.PGSQL.8888"
    2020-12-26 16:11:12.250 CET [7109] LOG:  database system was shut down at 2020-12-26 16:08:34 CET
    2020-12-26 16:11:12.274 CET [7106] LOG:  database system is ready to accept connections
     done
    server started
    

Why does that work? We did not provide the key at startup time, so PostgreSQL somehow must know how to get it. Actually, there is a new parameter that was automatically set to the command we specified when we initialized the cluster:

    postgres@debian10pg:/home/postgres/ [pgdev] grep cluster_key /var/tmp/pgenc/postgresql.conf 
    cluster_key_command = '/home/postgres/get_key.sh'
    

    If we remove that and start again it will not work:

    postgres@debian10pg:/home/postgres/ [pgdev] psql -c "alter system set cluster_key_command=''" postgres
    ALTER SYSTEM
    postgres@debian10pg:/home/postgres/ [pgdev] grep cluster_key /var/tmp/pgenc/postgresql.auto.conf 
    cluster_key_command = ''
    postgres@debian10pg:/home/postgres/ [pgdev] pg_ctl -D /var/tmp/pgenc/ stop
    2020-12-26 16:15:29.457 CET [7106] LOG:  received fast shutdown request
    waiting for server to shut down....2020-12-26 16:15:29.467 CET [7106] LOG:  aborting any active transactions
    2020-12-26 16:15:29.469 CET [7106] LOG:  background worker "logical replication launcher" (PID 7115) exited with exit code 1
    2020-12-26 16:15:29.473 CET [7110] LOG:  shutting down
    2020-12-26 16:15:29.534 CET [7106] LOG:  database system is shut down
     done
    server stopped
    16:15:29 postgres@debian10pg:/home/postgres/ [pgdev] pg_ctl -D /var/tmp/pgenc/ start
    waiting for server to start....2020-12-26 16:15:31.762 CET [7197] LOG:  starting PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
    2020-12-26 16:15:31.763 CET [7197] LOG:  listening on IPv6 address "::1", port 8888
    2020-12-26 16:15:31.763 CET [7197] LOG:  listening on IPv4 address "127.0.0.1", port 8888
    2020-12-26 16:15:31.778 CET [7197] LOG:  listening on Unix socket "/tmp/.s.PGSQL.8888"
    2020-12-26 16:15:31.786 CET [7197] FATAL:  cluster key must be 64 hexadecimal characters
    2020-12-26 16:15:31.787 CET [7197] LOG:  database system is shut down
     stopped waiting
    pg_ctl: could not start server
    Examine the log output.
    

    The two keys that have been generated when the cluster was initialized can be found in $PGDATA:

    postgres@debian10pg:/var/tmp/pgenc/ [pgdev] ls -la pg_cryptokeys/live/
    total 16
    drwx------ 2 postgres postgres 4096 Dec 26 16:08 .
    drwx------ 3 postgres postgres 4096 Dec 26 16:08 ..
    -rw------- 1 postgres postgres   72 Dec 26 16:08 0
    -rw------- 1 postgres postgres   72 Dec 26 16:08 1
    

The reason for two separate keys is that a primary and a replica cluster can have different keys for the table, index and all other files generated during database operations, but still share the same key for the WAL files. By the way, pg_controldata will also tell you if a cluster is encrypted:

    postgres@debian10pg:/var/tmp/pgenc/base/12833/ [pgdev] pg_controldata -D /var/tmp/pgenc/ | grep encr
    File encryption key length:           128
    

That really is a nice and much appreciated feature. Currently only the whole cluster can be encrypted, but I am sure that is sufficient for most of the use cases. Let’s hope that it will not get reverted for any reason.

The article Cluster level encryption for PostgreSQL 14 appeared first on the dbi services blog.
