
Kubernetes DNS resolution using CoreDNS (force update deployment)


After installing your Kubernetes cluster, composed of masters and workers, a few configuration steps still need to be completed. The join command is not the last operation to perform in order to have a fully operational cluster.
See how to deploy a k8s cluster using kubeadm here: https://blog.dbi-services.com/kubernetes-how-to-install-a-single-master-cluster-with-kubeadm/.
One of the most important configuration steps is name resolution (DNS) within the k8s cluster. In this blog post, we will see how to properly configure CoreDNS for the entire cluster.

Before beginning, it’s important to know that Kubernetes has had two DNS implementations: Kube-DNS and CoreDNS. The first versions of Kubernetes started with Kube-DNS and switched to CoreDNS as of version 1.10. If you want to know more about the comparison between the two: https://coredns.io/2018/11/27/cluster-dns-coredns-vs-kube-dns/

Pre-requisites:

> You need to have a Kubernetes cluster with the kubectl command-line tool configured
> Kubernetes version 1.6 or above
> A cluster of at least 3 nodes (1 master and 2 workers)

Once the cluster is initialized and the worker nodes have joined, you can check the status of the nodes and list all pods of the kube-system namespace as follows:

[docker@docker-manager000 ~]$ kubectl get nodes -o wide
NAME                STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
docker-manager000   Ready    master   55d   v1.15.3   10.36.0.10    <none>        CentOS Linux 7 (Core)   3.10.0-957.12.2.el7.x86_64   docker://18.9.6
docker-worker000    Ready    <none>   46d   v1.15.3   10.36.0.11    <none>        CentOS Linux 7 (Core)   3.10.0-957.10.1.el7.x86_64   docker://18.9.5
docker-worker001    Ready    <none>   46d   v1.15.3   10.36.0.12    <none>        CentOS Linux 7 (Core)   3.10.0-957.10.1.el7.x86_64   docker://18.9.5

According to the previous command, our cluster is composed of 3 nodes:
> docker-manager000
> docker-worker000
> docker-worker001
which means that pods will be scheduled across all the above hosts. Each host should therefore be able to resolve service names to IP addresses. The CoreDNS pods provide this resolution and need to be deployed on all hosts.
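
As a reminder, every Kubernetes Service gets a DNS name of the form <service>.<namespace>.svc.cluster.local, which is resolved through the ClusterIP of the kube-dns Service (the Service keeps this legacy name even when CoreDNS is the implementation behind it). You can display that Service as follows:

[docker@docker-manager000 ~]$ kubectl get svc kube-dns -n kube-system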

Let’s check the pod’s deployment in the kube-system namespace:

[docker@docker-manager000 ~]$ kubectl get pods -o wide -n kube-system
NAME                                        READY   STATUS    RESTARTS   AGE   IP              NODE                NOMINATED NODE   READINESS GATES
calico-kube-controllers-65b8787765-894gs    1/1     Running   16         55d   172.20.123.30   docker-manager000   <none>           <none>
calico-node-5zhsp                           1/1     Running   6          46d   10.36.0.12      docker-worker001    <none>           <none>
calico-node-gq5s9                           1/1     Running   8          46d   10.36.0.11      docker-worker000    <none>           <none>
calico-node-pjrfm                           1/1     Running   16         55d   10.36.0.10      docker-manager000   <none>           <none>
coredns-686f555694-mdsvd                    1/1     Running   6          35d   172.20.123.26   docker-manager000   <none>           <none>
coredns-686f555694-w25wn                    1/1     Running   6          35d   172.20.123.28   docker-manager000   <none>           <none>
etcd-docker-manager000                      1/1     Running   16         55d   10.36.0.10      docker-manager000   <none>           <none>
kube-apiserver-docker-manager000            1/1     Running   0          13d   10.36.0.10      docker-manager000   <none>           <none>
kube-controller-manager-docker-manager000   1/1     Running   46         55d   10.36.0.10      docker-manager000   <none>           <none>
kube-proxy-gwkdh                            1/1     Running   7          46d   10.36.0.11      docker-worker000    <none>           <none>
kube-proxy-lr5cf                            1/1     Running   6          46d   10.36.0.12      docker-worker001    <none>           <none>
kube-proxy-mn7mt                            1/1     Running   16         55d   10.36.0.10      docker-manager000   <none>           <none>
kube-scheduler-docker-manager000            1/1     Running   45         55d   10.36.0.10      docker-manager000   <none>           <none>

In more detail, let’s check where the CoreDNS pods have been deployed:

[docker@docker-manager000 ~]$ kubectl get pods -o wide -n kube-system | grep coredns

coredns-686f555694-mdsvd                    1/1     Running   6          35d   172.20.123.26   docker-manager000   <none>           <none>
coredns-686f555694-w25wn                    1/1     Running   6          35d   172.20.123.28   docker-manager000   <none>           <none>

Both CoreDNS pods have been deployed on the same host: docker-manager000, our master node. As a result, service name resolution will not work for all pods in the cluster. Let’s verify this assumption…

DNS Resolution test

Create a simple Pod to use for DNS testing:

[docker@docker-manager000 ~]$ cat > test-DNS.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF

[docker@docker-manager000 ~]$ kubectl apply -f test-DNS.yaml

Verify the status of the Pod previously deployed:

[docker@docker-manager000 ~]$ kubectl get pods -o wide
NAME      READY   STATUS    RESTARTS   AGE   IP              NODE               NOMINATED NODE   READINESS GATES
busybox   1/1     Running   0          13s   172.20.145.19   docker-worker000   <none>           <none>

The pod is deployed on one of the worker nodes; in our case it’s docker-worker000.

Once the pod is running, we can execute an nslookup command to verify whether DNS is working:

[docker@docker-manager000 ~]$ kubectl exec -it busybox -- nslookup kubernetes.default

Server:    172.21.0.10
Address 1: 172.21.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'kubernetes.default'

As suspected, DNS is not working properly. The next step is to deploy CoreDNS pods on all cluster nodes: docker-worker000 and docker-worker001 in our example.
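
Before fixing this, you can also confirm which DNS server the pod is actually using; the nameserver in the pod’s resolv.conf should be the kube-dns ClusterIP (172.21.0.10 in our case):

[docker@docker-manager000 ~]$ kubectl exec -it busybox -- cat /etc/resolv.conf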

CoreDNS update deployment

The first step is to update the CoreDNS deployment in order to increase the number of replicas, as follows:

[docker@docker-manager000 ~]$ kubectl edit deployment coredns -n kube-system
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "13"
  creationTimestamp: "2019-08-28T07:36:28Z"
  generation: 14
  labels:
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
  resourceVersion: "6455829"
  selfLink: /apis/extensions/v1beta1/namespaces/kube-system/deployments/coredns
  uid: 3ebfd10f-c58b-43f4-84f1-a9f56dbdffdc
spec:
  progressDeadlineSeconds: 600
  replicas: 3
...

We updated the number of replicas from 2 to 3. Save the changes and wait a few seconds for the new CoreDNS pod to be deployed within the kube-system namespace.
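
As a side note, the same change can be made non-interactively, which is easier to script (this is equivalent to the edit above):

[docker@docker-manager000 ~]$ kubectl scale deployment coredns --replicas=3 -n kube-system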

[docker@docker-manager000 ~]$ kubectl get pods -o wide -n kube-system | grep coredns
coredns-686f555694-k4678                    1/1     Running   10         36d   172.20.27.186   docker-worker001    <none>           <none>
coredns-686f555694-mdsvd                    1/1     Running   6          36d   172.20.123.26   docker-manager000   <none>           <none>
coredns-686f555694-w25wn                    1/1     Running   6          36d   172.20.123.28   docker-manager000   <none>           <none>

At this point, something interesting happens: the Kubernetes scheduler considers that only one additional CoreDNS pod is needed, since two pods already exist. For this reason, only one new CoreDNS pod has been deployed, and it landed on docker-worker001 by chance, while the two original pods remain on the master.

A workaround is to force update the CoreDNS deployment as follows:

[docker@docker-manager000 ~]$ wget https://raw.githubusercontent.com/zlabjp/kubernetes-scripts/master/force-update-deployment 
[docker@docker-manager000 ~]$ chmod +x force-update-deployment
[docker@docker-manager000 ~]$ ./force-update-deployment coredns -n kube-system
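
The script essentially patches the deployment’s pod template with a timestamp, which forces Kubernetes to recreate all the pods. On kubectl 1.15 and later (our cluster runs v1.15.3), a similar effect can be obtained natively with:

[docker@docker-manager000 ~]$ kubectl rollout restart deployment coredns -n kube-system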

Check now the status of the CoreDNS pods:

[docker@docker-manager000 ~]$ kubectl get pods -o wide -n kube-system | grep coredns
coredns-7dc96b7db7-7ndwr                    1/1     Running   0          35s   172.20.145.36   docker-worker000    <none>           <none>
coredns-7dc96b7db7-v7wjg                    1/1     Running   0          28s   172.20.123.27   docker-manager000   <none>           <none>
coredns-7dc96b7db7-v9qcq                    1/1     Running   0          35s   172.20.27.181   docker-worker001    <none>           <none>

The script redeploys the CoreDNS pods, which should now be spread across all hosts.

It may sometimes happen that the CoreDNS pods are not redeployed across all hosts (for example, two pods end up on the same host). In that case, run the script again until one CoreDNS pod is deployed on each cluster node.
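
A more deterministic alternative (a suggestion on my side, not part of the original procedure) is to add a pod anti-affinity rule under spec.template.spec of the coredns deployment, so that the scheduler prefers not to co-locate two CoreDNS replicas on the same node:

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  k8s-app: kube-dns
              topologyKey: kubernetes.io/hostname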

The DNS resolution should now work properly within the entire cluster. Let’s verify it by replaying the DNS resolution test.

Remove and redeploy the busybox deployment as follows:

[docker@docker-manager000 ~]$ kubectl delete -f test-DNS.yaml
pod "busybox" deleted
[docker@docker-manager000 ~]$ kubectl apply -f test-DNS.yaml
pod/busybox created
#Check pod status
[docker@docker-manager000 ~]$ kubectl get pods -o wide
NAME      READY   STATUS    RESTARTS   AGE   IP              NODE               NOMINATED NODE   READINESS GATES
busybox   1/1     Running   0          13s   172.20.145.19   docker-worker000   <none>           <none>

Once the pod is running, we can execute an nslookup command to confirm that DNS is now working properly:

[docker@docker-manager000 ~]$ kubectl exec -it busybox -- nslookup kubernetes.default

Server:    172.21.0.10
Address 1: 172.21.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes.default
Address 1: 172.21.0.1 kubernetes.default.svc.cluster.local

Now our internal cluster DNS is working well 🙂 !!

The article Kubernetes DNS resolution using CoreDNS (force update deployment) first appeared on Blog dbi services.


Why you really should use peer authentication in PostgreSQL


It is always a bit of a surprise that many people do not know about peer authentication in PostgreSQL. You might ask why that is important, since initdb creates a default pg_hba.conf which does not allow any connections from outside the PostgreSQL server. While that is true, there is another important point to consider.

Let’s assume you executed initdb without any options like this:

postgres@centos8pg:/home/postgres/ [pgdev] mkdir /var/tmp/test
postgres@centos8pg:/home/postgres/ [pgdev] initdb -D /var/tmp/test
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/tmp/test ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Europe/Zurich
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /var/tmp/test -l logfile start

Did you ever notice the warning at the end of the output?

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
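
As a side note, you can avoid the trust default altogether at cluster creation time by passing the authentication methods the warning refers to, for example:

postgres@centos8pg:/home/postgres/ [pgdev] initdb -D /var/tmp/test --auth-local=peer --auth-host=scram-sha-256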

You might think that this is not important, as only the DBAs will have access to the operating system user postgres (or whatever user you used when you executed initdb). Although this might be true in your case, the server might well have other local users. Before creating a new user, let’s start the instance:

postgres@centos8pg:/home/postgres/ [pgdev] export PGPORT=9999
postgres@centos8pg:/home/postgres/ [pgdev] pg_ctl -D /var/tmp/test/ start -l /dev/null
waiting for server to start.... done
server started

What you really need to be aware of is this:

postgres@centos8pg:/home/postgres/ [pgdev] sudo useradd test
postgres@centos8pg:/home/postgres/ [pgdev] sudo su - test
[test@centos8pg ~]$ /u01/app/postgres/product/DEV/db_1/bin/psql -p 9999 -U postgres postgres
psql (13devel)
Type "help" for help.

postgres=#

… and you are in as the superuser! So, by default, any local user can connect as the superuser. What you might want to do is this:

postgres@centos8pg:/home/postgres/ [pgdev] sudo chmod o-rwx /u01/app/postgres/product
postgres@centos8pg:/home/postgres/ [pgdev] sudo su - test
Last login: Tue Oct 22 21:19:58 CEST 2019 on pts/0
[test@centos8pg ~]$ /u01/app/postgres/product/DEV/db_1/bin/psql -p 9999 -U postgres postgres
-bash: /u01/app/postgres/product/DEV/db_1/bin/psql: Permission denied

This prevents all other users on the system from executing the psql binary. If you can guarantee that nobody installs psql in another way on the system, that might be sufficient. But as soon as psql is available somewhere on the system, you’re lost again:

postgres@centos8pg:/home/postgres/ [pgdev] sudo dnf provides psql
Last metadata expiration check: 0:14:53 ago on Tue 22 Oct 2019 09:09:23 PM CEST.
postgresql-10.6-1.module_el8.0.0+15+f57f353b.x86_64 : PostgreSQL client programs
Repo        : AppStream
Matched from:
Filename    : /usr/bin/psql

postgres@centos8pg:/home/postgres/ [pgdev] sudo dnf install -y postgresql-10.6-1.module_el8.0.0+15+f57f353b.x86_64
[test@centos8pg ~]$ /usr/bin/psql -p 9999 -U postgres -h /tmp postgres
psql (10.6, server 13devel)
WARNING: psql major version 10, server major version 13.
Some psql features might not work.
Type "help" for help.

postgres=#

Not really an option. This is where peer authentication becomes very handy.

postgres@centos8pg:/home/postgres/ [pgdev] sed -i 's/local   all             all                                     trust/local   all             all                                     peer/g' /var/tmp/test/pg_hba.conf

Once you switch from trust to peer for local connections, only an operating system user whose name matches an existing database user (here postgres, the user that created the instance) will be able to connect locally without providing a password:

postgres@centos8pg:/home/postgres/ [pgdev] pg_ctl -D /var/tmp/test/ reload
server signaled
postgres@centos8pg:/home/postgres/ [pgdev] psql postgres
psql (13devel)
Type "help" for help.

[local]:9999 postgres@postgres=#

Other local users will not be able to connect anymore:

postgres@centos8pg:/home/postgres/ [pgdev] sudo su - test
Last login: Tue Oct 22 21:25:36 CEST 2019 on pts/0
[test@centos8pg ~]$ /usr/bin/psql -p 9999 -U postgres -h /tmp postgres
psql: FATAL:  Peer authentication failed for user "postgres"
[test@centos8pg ~]$

So, please, consider enabling peer authentication, or at least go for md5 for local connections as well.
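
As a minimal sketch, the relevant pg_hba.conf entries could then look like this (peer for local socket connections, password authentication for TCP connections):

# TYPE  DATABASE        USER            ADDRESS                 METHOD
local   all             all                                     peer
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5

Keep in mind that peer maps the operating system user name to the database user name; if the two differ, a user name map in pg_ident.conf (referenced via peer map=somemap) is required.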

The article Why you really should use peer authentication in PostgreSQL first appeared on Blog dbi services.

Connecting to ODA derby database


ODA light (ODA X7-2S, X7-2M, X8-2S, X8-2M) comes with an internal derby database to manage the ODA metadata. From time to time there is a need to check or update some information within it, for example when facing a database deletion issue. I would like to strongly advise that manually updating the ODA repository should only be done with Oracle support guidance and their agreement to do so. Neither the author (that’s me 🙂 ) nor dbi services 😉 would be responsible for any issue or consequence of the commands described in this blog. That would be your own responsibility. 😉
This blog is more intended to show how to connect to this internal derby database.

Performing a backup of the derby database

Breaking the derby database would damage the ODA, with the consequence of having to reimage it.

To back up the derby database, we need to stop the dcs agent.

[root@ODA03 derbyjar]# initctl stop initdcsagent
initdcsagent stop/waiting

[root@ODA03 derbyjar]# ps -ef | grep dcs-agent | grep -v grep
[root@ODA03 derbyjar]#

Then we can back up the repository:

[root@ODA03 derbyjar]# cd /opt/oracle/dcs/repo/

[root@ODA03 repo]# ls -l
total 4
drwxr-xr-x 4 root root 4096 Jun 12 14:05 node_0

[root@ODA03 repo]# cp -rp node_0 node_0_BKP_12.06.2019
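
Should the repository ever need to be restored from that backup, a minimal sketch would be (assuming the dcs agent is still stopped, and that initctl start is the counterpart of the stop command used above):

[root@ODA03 repo]# mv node_0 node_0_BROKEN
[root@ODA03 repo]# cp -rp node_0_BKP_12.06.2019 node_0
[root@ODA03 repo]# initctl start initdcsagent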

Connecting to the derby database

To connect to the backup previously taken, use the following connect string:

[root@ODA03 repo]# /usr/java/jdk1.8.0_161/db/bin/ij
ij version 10.11
ij> connect 'jdbc:derby:node_0_BKP_12.06.2019';

To connect to the running derby database, use the following connect string (the dcs agent needs to be stopped):

[root@ODA03 repo]# /usr/java/jdk1.8.0_161/db/bin/ij
ij version 10.11
ij> connect 'jdbc:derby:node_0';

Running commands

The language used to interact with the derby database is plain SQL:

ij> select id,name,dbname,dbid,status,DBSTORAGE from db where dbname='test';
ID                                                                                                                              |NAME                                                                                                                            |DBNAME                                                                                                                          |DBID                                                                                                                            |STATUS                                                                                                                          |DBSTORAGE
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
c7db612e-f237-4491-9a30-ff2c2b75831f                                                                                            |test                                                                                                                            |test                                                                                                                            |032306496044                                                                                                                    |Deleting                                                                                                                        |Acfs

1 row selected
ij>

To list all the tables, you can run :

ij> show tables;
TABLE_SCHEM         |TABLE_NAME                    |REMARKS
------------------------------------------------------------------------
SYS                 |SYSALIASES                    |
SYS                 |SYSCHECKS                     |
...
...
...
APP                 |CPUCORES                      |
APP                 |DATABASE                      |
APP                 |DATABASEHOME                  |
APP                 |DB                            |
APP                 |DBHOME                        |
APP                 |DBNODE                        |
APP                 |DBSTORAGEDETAILS              |
APP                 |DBSTORAGEDETAILS_VOLS         |
APP                 |DBSTORAGELOCATIONS            |
APP                 |DCSPARAMETERS                 |
APP                 |DCS_USER                      |
APP                 |DG_CONFIGURATION              |
APP                 |DG_CONFIGURATION_REPLICATION_&|
APP                 |DISK                          |
APP                 |DISKGROUP                     |
APP                 |DISKINFO                      |
APP                 |FILEMULTIUPLOADPARTSRECORD    |
APP                 |FILEMULTIUPLOADRECORD         |
APP                 |FS                            |
APP                 |GI                            |
APP                 |GIHOME                        |
APP                 |GROUPENTITY                   |
APP                 |IDEMPOTENCYMAP                |
APP                 |JOBEXECUTION                  |
APP                 |JOBSCHEDULE                   |
APP                 |JOB_REPORT                    |
APP                 |JOB_RESOURCE_INFO             |
APP                 |LOGCLEANPOLICY                |
APP                 |LOGCLEANUPSUMMARY             |
APP                 |NETSECURITYRULES              |
APP                 |NETSECURITY_ENCRYPTIONALGORIT&|
APP                 |NETSECURITY_INTEGRITYALGORITH&|
APP                 |NETWORK                       |
APP                 |NETWORKINTERFACE              |
APP                 |NETWORKINTERFACE_INTERFACEMEM&|
APP                 |NETWORK_NETWORKTYPE           |
APP                 |OBJECTSTORESWIFT              |
APP                 |OBSERVER                      |
...
...
...

93 rows selected
ij>

Exiting derby database connection

To exit the tool, use the following command:

ij> exit;

Getting the latest version of the derby jar

On one occasion, the derby jar file provided by the Oracle support team was not the correct version. As per their guidance, I had to run:

[root@ODA03 repo]# java -cp /root/derbyjar/derby.jar:/root/derbyjar/derbytools.jar org.apache.derby.tools.ij
ij version 10.11
ij> connect 'jdbc:derby:node_0';

I got the following errors:

ERROR XJ040: Failed to start database 'node_0' with class loader sun.misc.Launcher$AppClassLoader@42a57993, see the next exception for details.
ERROR XSLAN: Database at /opt/oracle/dcs/repo/node_0 has an incompatible format with the current version of the software.  The database was created by or upgraded by version 10.14.

I downloaded the latest appropriate derby jar from the Apache web site and could then successfully connect, using the derby jar version matching the current ODA repository.

[root@ODA03 repo]# java -cp /root/derbyjar/derby.jar:/root/derbyjar/derbytools.jar org.apache.derby.tools.ij
ij version 10.11
ij> connect 'jdbc:derby:node_0';
ij>

Conclusion

It might be interesting to know how to connect to the ODA metadata database in order to check, troubleshoot and understand some internal ODA behavior. BUT remember that no update should be made in this database without Oracle support agreement and without first taking a backup of the repository.

The article Connecting to ODA derby database first appeared on Blog dbi services.

Having multiple standby databases and cascading with dbvisit


Dbvisit Standby is a disaster recovery solution that you can use with Oracle Standard Edition. I have been working on a customer project where I had to set up a system with one primary and two standby databases. One of the standby databases had to run with a gap of 24 hours. Knowing that flashback possibilities are very limited on Standard Edition, this gives the customer the ability to extract and restore data wrongly lost following human error.

The initial configuration would be the following one :

Database instance, db_name : MyDB
MyDB_02 (db_unique_name) primary database running on srv02 server.
MyDB_01 (db_unique_name) expected standby database running on srv01 server.
MyDB_03 (db_unique_name) expected standby database running on srv03 server.

The following DDC configuration files will be used:
MyDBSTD1: configuration file for the first standby, synchronized every 10 minutes.
MyDBSTD2: configuration file for the second standby, synchronized every 24 hours.

Let me walk you through the steps to set up such a configuration. This article is not intended to show the whole process of implementing a dbvisit solution, but only the steps required to work with multiple standbys. We will also talk about how to implement a cascaded standby and an apply lag delay within dbvisit.

Recommendations

In order to limit the manual configuration changes in the DDC files after a switchover, it is recommended to use, as much as possible, the same ORACLE_HOME, archive destination and dbvisit home directory on all servers.

Creating MyDBSTD1 DDC configuration file

The first standby configuration file will be created and used between MyDB_03 (srv03) and MyDB_02 (srv02).

oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -o setup


=========================================================

     Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd)
           http://www.dbvisit.com

=========================================================

=>dbvctl only needs to be run on the primary server.

Is this the primary server?  [Yes]:
The following Dbvisit Database configuration (DDC) file(s) found on this
server:

     DDC
     ===
1)   Create New DDC
2)   Cancel

Please enter choice [] : 1

Is this correct?  [Yes]:

...
...
...

Below are the list of configuration variables provided during the setup process:

Configuration Variable             Value Provided
======================             ==============
ORACLE_SID                         MyDB
ORACLE_HOME                        /opt/oracle/product/12.2.0

SOURCE                             srv02
ARCHSOURCE                         /u03/app/oracle/dbvisit_arch/MyDB
RAC_DR                             N
USE_SSH                            N
DESTINATION                        srv03
NETPORT                            7890
DBVISIT_BASE_DR                    /u01/app/dbvisit
ORACLE_HOME_DR                     /u01/app/oracle/product/12.2.0.1/dbhome_1
DB_UNIQUE_NAME_DR                  MyDB_03
ARCHDEST                           /u03/app/oracle/dbvisit_arch/MyDB
ORACLE_SID_DR                      MyDB
ENV_FILE                           MyDBSTD1

Are these variables correct?  [Yes]:

>>> Dbvisit Database configuration (DDC) file MyDBSTD1 created.

>>> Dbvisit Database repository (DDR) MyDB created.
   Repository Version          8.4
   Software Version            8.4
   Repository Status           VALID


Do you want to enter license key for the newly created Dbvisit Database configuration (DDC) file?  [Yes]:

Enter license key and press Enter: []: XXXXXXXXXXXXXXXXXXXXXXXXXXX
>>> Dbvisit Standby License
License Key     : XXXXXXXXXXXXXXXXXXXXXXXXXXX
customer_number : XXXXXX
dbname          : MyDB
expiry_date     : 2099-05-06
product_id      : 8
sequence        : 1
status          : VALID
updated         : YES

PID:423545
TRACE:dbvisit_install.log

Synchronizing both MyDB_02 and MyDB_03

Shipping logs from the primary to the standby

oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d MyDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 326409)
dbvctl started on srv02: Mon May 20 16:29:14 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 30. Transfer log gap: 58080
>>> Sending heartbeat message... skipped
>>> First time Dbvisit Standby runs, Dbvisit Standby configuration will be copied to
    srv03...
>>> Transferring Log file(s) from MyDB on srv02 to srv03 for thread 1:

    thread 1 sequence 58051 (1_58051_987102791.dbf)
    thread 1 sequence 58052 (1_58052_987102791.dbf)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.dbf)
    thread 1 sequence 58080 (1_58080_987102791.dbf)

=============================================================
dbvctl ended on srv02: Mon May 20 16:30:50 2019
=============================================================

Applying logs on the standby database

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 21504)
dbvctl started on srv03: Mon May 20 16:33:42 2019
=============================================================

>>> Sending heartbeat message... skipped

>>> Applying Log file(s) from srv02 to MyDB on srv03:

    thread 1 sequence 58051 (1_58051_987102791.arc)
    thread 1 sequence 58052 (1_58052_987102791.arc)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.arc)
    thread 1 sequence 58080 (1_58080_987102791.arc)
    Last applied log(s):
    thread 1 sequence 58080

    Next SCN required for recovery 49719323442 generated at 2019-05-20:16:27:09 +02:00.
    Next required log thread 1 sequence 58081

=============================================================
dbvctl ended on srv03: Mon May 20 16:36:52 2019
=============================================================

Running a gap report

oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 335068)
dbvctl started on srv02: Mon May 20 16:37:53 2019
=============================================================


Dbvisit Standby log gap report for MyDB thread 1 at 201905201637:
-------------------------------------------------------------
Destination database on srv03 is at sequence: 58081.
Source database on srv02 is at log sequence: 58082.
Source database on srv02 is at archived log sequence: 58081.
Dbvisit Standby last transfer log sequence: 58081.
Dbvisit Standby last transfer at: 2019-05-20 16:37:36.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:01.


=============================================================
dbvctl ended on srv02: Mon May 20 16:37:57 2019
=============================================================

Switchover to srv03

At that point in the project, we performed a switchover to the newly created srv03 standby in order to test its stability. The switchover was performed as described below, but this step is not mandatory when implementing several standby databases. As a best practice, we always test the first configuration by running a switchover before moving forward.

oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -o switchover
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 12196)
dbvctl started on srv02: Tue May 28 00:07:34 2019
=============================================================

>>> Starting Switchover between srv02 and srv03

Running pre-checks       ... done
Pre processing           ... done
Processing primary       ... done
Processing standby       ... done
Converting standby       ... done
Converting primary       ... done
Completing               ... done
Synchronizing            ... done
Post processing          ... done

>>> Graceful switchover completed.
    Primary Database Server: srv03
    Standby Database Server: srv02

>>> Dbvisit Standby can be run as per normal:
    dbvctl -d MyDBSTD1


PID:12196
TRACE:12196_dbvctl_switchover_MyDBSTD1_201905280007.trc

=============================================================
dbvctl ended on srv02: Tue May 28 00:13:31 2019
=============================================================

srv03 is now the new primary and srv02 a new standby database.

Creating MyDBSTD2 DDC configuration file

Once the MyDB_01 standby database is up and running, we can create its related DDC configuration file. To do so, we simply copy the previous DDC configuration file, MyDBSTD1, and update it as needed.

I first transferred the file from the current primary srv03 to the new standby server srv01:

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] scp dbv_MyDBSTD1.env oracle@srv01:$PWD
dbv_MyDBSTD1.env		100% 	23KB 	22.7KB/s 		00:00

I copied it to the new DDC configuration file name:

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] cp dbv_MyDBSTD1.env dbv_MyDBSTD2.env

I updated the new DDC configuration accordingly to have:

  • DESTINATION as srv01 instead of srv02
  • DB_UNIQUE_NAME_DR as MyDB_01 instead of MyDB_02
  • MAILCFG updated so that alerts coming from the STD2 configuration can be distinguished.

The primary will remain the same : srv03.

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] vi dbv_MyDBSTD2.env

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] diff dbv_MyDBSTD1.env dbv_MyDBSTD2.env
86c86
< DESTINATION = srv02
---
> DESTINATION = srv01
93c93
< DB_UNIQUE_NAME_DR = MyDB
---
> DB_UNIQUE_NAME_DR = MyDB_01
135,136c135,136
< MAILCFG_FROM = dbvisit_conf_1@domain.name
< MAILCFG_FROM_DR = dbvisit_conf_1@domain.name
---
> MAILCFG_FROM = dbvisit_conf_2@domain.name
> MAILCFG_FROM_DR = dbvisit_conf_2@domain.name

In case the ORACLE_HOME and archive destination are not the same, these parameters will have to be updated as well.
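
A quick way to double-check the relevant variables after editing is a simple grep, reusing the parameter names shown in the setup output above:

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] grep -E '^(SOURCE|DESTINATION|ORACLE_HOME|ARCHSOURCE|ARCHDEST|DB_UNIQUE_NAME)' dbv_MyDBSTD2.env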

Synchronizing both MyDB_03 and MyDB_01

Shipping logs from the primary to the standby

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 25914)
dbvctl started on srv03: Wed Jun  5 20:32:09 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 383. Transfer log gap: 67385
>>> Sending heartbeat message... done
>>> First time Dbvisit Standby runs, Dbvisit Standby configuration will be copied to
    srv01...
>>> Transferring Log file(s) from MyDB on srv03 to srv01 for thread 1:

    thread 1 sequence 67003 (o1_mf_1_67003_ghgwj0z2_.arc)
    thread 1 sequence 67004 (o1_mf_1_67004_ghgwmj1w_.arc)
...
...
...
    thread 1 sequence 67384 (o1_mf_1_67384_ghj2fbgj_.arc)
    thread 1 sequence 67385 (o1_mf_1_67385_ghj2g883_.arc)

=============================================================
dbvctl ended on srv03: Wed Jun  5 20:42:05 2019
=============================================================

Applying logs on the standby database

oracle@srv01:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 69764)
dbvctl started on srv01: Wed Jun  5 20:42:45 2019
=============================================================

>>> Sending heartbeat message... done

>>> Applying Log file(s) from srv03 to MyDB on srv01:

    thread 1 sequence 67003 (1_67003_987102791.arc)
    thread 1 sequence 67004 (1_67004_987102791.arc)
...
...
...
    thread 1 sequence 67384 (1_67384_987102791.arc)
    thread 1 sequence 67385 (1_67385_987102791.arc)
    Last applied log(s):
    thread 1 sequence 67385

    Next SCN required for recovery 50112484332 generated at 2019-06-05:20:28:24 +02:00.
    Next required log thread 1 sequence 67386

>>> Dbvisit Archive Management Module (AMM)

    Config: number of archives to keep      = 0
    Config: number of days to keep archives = 3
    Config: diskspace full threshold        = 80%
==========

Processing /u03/app/oracle/dbvisit_arch/MyDB...
    Archive log dir: /u03/app/oracle/dbvisit_arch/MyDB
    Total number of archive files   : 383
    Number of archive logs deleted = 0
    Current Disk percent full       : 8%

=============================================================
dbvctl ended on srv01: Wed Jun  5 21:16:30 2019
=============================================================

Running a gap report

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 44143)
dbvctl started on srv03: Wed Jun  5 21:17:03 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906052117:
-------------------------------------------------------------
Destination database on srv01 is at sequence: 67385.
Source database on srv03 is at log sequence: 67387.
Source database on srv03 is at archived log sequence: 67386.
Dbvisit Standby last transfer log sequence: 67385.
Dbvisit Standby last transfer at: 2019-06-05 20:42:05.

Archive log gap for thread 1:  1.
Transfer log gap for thread 1: 1.
Standby database time lag (DAYS-HH:MI:SS): +00:48:41.

Switchover to srv01

Now we have both the srv01 and srv02 standby databases up and running and connected to the current srv03 primary database. Let’s switch over to srv01 and see what steps are required. After each switchover, the DDC configuration file of the other standby will have to be manually updated.

Checking srv03 and srv02 are synchronized

Both the srv03 and srv02 databases should be in sync; otherwise, ship and apply the archive logs first.

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 93307)
dbvctl started on srv03: Wed Jun  5 21:27:02 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906052127:
-------------------------------------------------------------
Destination database on srv02 is at sequence: 67386.
Source database on srv03 is at log sequence: 67387.
Source database on srv03 is at archived log sequence: 67386.
Dbvisit Standby last transfer log sequence: 67386.
Dbvisit Standby last transfer at: 2019-06-05 21:24:47.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:27:02.


=============================================================
dbvctl ended on srv03: Wed Jun  5 21:27:08 2019
=============================================================

Checking srv03 and srv01 are synchronized

Both the srv03 and srv01 databases should be in sync; otherwise, ship and apply the archive logs first.

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 90871)
dbvctl started on srv03: Wed Jun  5 21:26:31 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906052126:
-------------------------------------------------------------
Destination database on srv01 is at sequence: 67386.
Source database on srv03 is at log sequence: 67387.
Source database on srv03 is at archived log sequence: 67386.
Dbvisit Standby last transfer log sequence: 67386.
Dbvisit Standby last transfer at: 2019-06-05 21:26:02.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:26:02.

Switchover to srv01

oracle@srv03:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -o switchover
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 20334)
dbvctl started on srv03: Wed Jun  5 21:31:56 2019
=============================================================

>>> Starting Switchover between srv03 and srv01

Running pre-checks       ... done
Pre processing           ... done
Processing primary       ... done
Processing standby       ... done
Converting standby       ... done
Converting primary       ... done
Completing               ... done
Synchronizing            ... done
Post processing          ... done

>>> Graceful switchover completed.
    Primary Database Server: srv01
    Standby Database Server: srv03

>>> Dbvisit Standby can be run as per normal:
    dbvctl -d MyDBSTD2


PID:20334
TRACE:20334_dbvctl_switchover_MyDBSTD2_201906052131.trc

=============================================================
dbvctl ended on srv03: Wed Jun  5 21:37:40 2019
=============================================================

Attach srv02 to srv01 (new primary)

Prior to the switchover:

  • srv03 and srv01 were using the MyDBSTD2 DDC configuration file
  • srv03 and srv02 were using the MyDBSTD1 DDC configuration file

The srv02 standby database now needs to be attached to the new primary, srv01. For this we will copy the MyDBSTD1 DDC configuration file from srv02 to srv01, as this is the first time srv01 is primary. Otherwise, we would only need to update the already existing file accordingly.

I transferred the DDC file:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [MyDB] scp dbv_MyDBSTD1.env oracle@srv01:$PWD
dbv_MyDBSTD1.env    100%   23KB  14.8MB/s   00:00

The MyDBSTD1 configuration file has been updated accordingly to reflect the new configuration:

  • SOURCE needs to be changed from srv03 to srv01
  • DESTINATION will remain srv02
  • DB_UNIQUE_NAME needs to be changed from MyDB_03 to MyDB_01
  • DB_UNIQUE_NAME_DR will remain MyDB_02

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] vi dbv_MyDBSTD1.env

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] grep ^SOURCE dbv_MyDBSTD1.env
SOURCE = srv01

oracle@srv01:/u01/app/dbvisit/standby/conf/ [MyDB] grep DB_UNIQUE_NAME dbv_MyDBSTD1.env
# DB_UNIQUE_NAME      - Primary database db_unique_name
DB_UNIQUE_NAME = MyDB_01
# DB_UNIQUE_NAME_DR   - Standby database db_unique_name
DB_UNIQUE_NAME_DR = MyDB_02

Checking that databases are all synchronized

After performing several log switches on the primary in order to generate archive logs, I transferred and applied the needed archive log files on both the srv02 and srv03 standby databases, and made sure both were synchronized.
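
The log switches themselves were generated on the primary with standard SQL, for example:

oracle@srv01:/u01/app/dbvisit/standby/ [MyDB] sqlplus / as sysdba
SQL> alter system switch logfile;
SQL> alter system switch logfile;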

srv01 and srv03 databases :

oracle@srv01:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 98156)
dbvctl started on srv01: Wed Jun  5 21:52:08 2019
=============================================================


Dbvisit Standby log gap report for MyDB_01 thread 1 at 201906052152:
-------------------------------------------------------------
Destination database on srv03 is at sequence: 67413.
Source database on srv01 is at log sequence: 67414.
Source database on srv01 is at archived log sequence: 67413.
Dbvisit Standby last transfer log sequence: 67413.
Dbvisit Standby last transfer at: 2019-06-05 21:51:13.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:00.


=============================================================
dbvctl ended on srv01: Wed Jun  5 21:52:18 2019
=============================================================

srv01 and srv02 databases :

oracle@srv01:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 100393)
dbvctl started on srv01: Wed Jun  5 21:56:06 2019
=============================================================


Dbvisit Standby log gap report for MyDB_01 thread 1 at 201906052156:
-------------------------------------------------------------
Destination database on srv02 is at sequence: 67413.
Source database on srv01 is at log sequence: 67414.
Source database on srv01 is at archived log sequence: 67413.
Dbvisit Standby last transfer log sequence: 67413.
Dbvisit Standby last transfer at: 2019-06-05 21:55:22.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:05:13.


=============================================================
dbvctl ended on srv01: Wed Jun  5 21:56:07 2019
=============================================================

Apply delay lag

The MyDBSTD2 configuration should, in the end, have an apply lag of 24 hours. This can be achieved using APPLY_DELAY_LAG_MINUTES in the configuration. In order to test it, I decided with the customer to use a 60-minute delay first.

Update MyDBSTD2 DDC configuration file

The following parameters have been updated in the configuration:
APPLY_DELAY_LAG_MINUTES = 60
DMN_MONITOR_INTERVAL_DR = 0
TRANSFER_LOG_GAP_THRESHOLD = 0
ARCHIVE_LOG_GAP_THRESHOLD = 60

APPLY_DELAY_LAG_MINUTES is the delay in minutes to wait before applying the change vectors.
DMN_MONITOR_INTERVAL_DR is the interval in seconds for the log monitor schedule on the destination; 0 means deactivated.
TRANSFER_LOG_GAP_THRESHOLD is the difference allowed between the last archived sequence on the primary and the last sequence transferred to the standby server.
ARCHIVE_LOG_GAP_THRESHOLD is the difference allowed between the last archived sequence on the primary and the last applied sequence on the standby database before an alert is sent.

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] cp dbv_MyDBSTD2.env dbv_MyDBSTD2.env.201906131343

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] vi dbv_MyDBSTD2.env

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] diff dbv_MyDBSTD2.env dbv_MyDBSTD2.env.201906131343
281c281
< DMN_MONITOR_INTERVAL_DR = 0
---
> DMN_MONITOR_INTERVAL_DR = 5
331c331
< APPLY_DELAY_LAG_MINUTES = 60
---
> APPLY_DELAY_LAG_MINUTES = 0
374c374
< ARCHIVE_LOG_GAP_THRESHOLD = 60
---
> ARCHIVE_LOG_GAP_THRESHOLD = 0

oracle@srv03:/u01/app/dbvisit/standby/conf/ [MyDB] grep ^TRANSFER_LOG_GAP_THRESHOLD dbv_MyDBSTD2.env
TRANSFER_LOG_GAP_THRESHOLD = 0

Report displayed with an apply delay lag configured

When generating a report, we can see that there is no gap in the log transfer, as the archive logs are transferred through the crontab every 10 minutes. On the other hand, we can see the expected delay of 60 minutes in applying the logs.
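
For reference, the periodic shipping and applying mentioned above is typically driven by cron on both servers; a hypothetical sketch (schedule and log path are assumptions, not taken from the actual setup):

# run dbvctl every 10 minutes; the same entry is used on primary and standby
# (the command ships logs on the primary and applies them on the standby)
*/10 * * * * /u01/app/dbvisit/standby/dbvctl -d MyDBSTD2 >> /tmp/dbvctl_MyDBSTD2.log 2>&1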

oracle@srv03:/u01/app/dbvisit/standby/ [MyDBTEST] ./dbvctl -d MyDBSTD2 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 66003)
dbvctl started on srv03: Thu Jun 13 15:21:29 2019
=============================================================


Dbvisit Standby log gap report for MyDB_03 thread 1 at 201906131521:
-------------------------------------------------------------
Destination database on srv01 is at sequence: 73856.
Source database on srv03 is at log sequence: 73890.
Source database on srv03 is at archived log sequence: 73889.
Dbvisit Standby last transfer log sequence: 73889.
Dbvisit Standby last transfer at: 2019-06-13 15:20:15.

Archive log gap for thread 1:  33 (apply_delay_lag_minutes=60).
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +01:00:00.


=============================================================
dbvctl ended on srv03: Thu Jun 13 15:21:35 2019
=============================================================

Cascading standby database

What about cascading standby databases? Cascading is possible with dbvisit, and has been since version 8. We would use a cascaded standby for a reporting server that needs to be updated less frequently, or to offload the primary database from sending archive logs to multiple standby databases. The cascaded standby database is kept updated through the first standby.

The following needs to be known:

  • Switchover will not be possible between the primary and the cascaded standby database.
  • The DDC configuration file between the first standby and the cascaded standby needs to have:
    • As SOURCE the first standby database
    • The CASCADE parameter set to Y. This is done automatically when creating the DDC configuration with dbvctl -o setup. In the traces you will see: >>> Source database is a standby database. CASCADE flag will be turned on.
    • ARCHDEST and ARCHSOURCE locations on the first standby need to have the same values.

The principle is then exactly the same: running dbvctl -d from the first standby ships the archive logs to the second standby.

I ran some tests in my lab.

Environment

DBVP is the primary server.
DBVS is the first standby server.
DBVS2 is the second cascaded server.

oracle@DBVP:/u01/app/dbvisit/standby/ [DBVPDB] DBVPDB
********* dbi services Ltd. *********
STATUS                 : OPEN
DB_UNIQUE_NAME         : DBVPDB_SITE1
OPEN_MODE              : READ WRITE
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PRIMARY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
VERSION                : 12.2.0.1.0
CDB Enabled            : NO
*************************************

oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] DBVPDB
********* dbi services Ltd. *********
STATUS                 : MOUNTED
DB_UNIQUE_NAME         : DBVPDB_SITE2
OPEN_MODE              : MOUNTED
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PHYSICAL STANDBY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
CDB Enabled            : NO
*************************************


oracle@DBVS2:/u01/app/dbvisit/standby/ [DBVPDB] DBVPDB
********* dbi services Ltd. *********
STATUS                 : MOUNTED
DB_UNIQUE_NAME         : DBVPDB_SITE3
OPEN_MODE              : MOUNTED
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PHYSICAL STANDBY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
CDB Enabled            : NO
*************************************

Create cascaded DDC configuration file

The DDC configuration file will be created from the first standby node.
DBVS (first standby server) will be the SOURCE.
DBVS2 (cascaded standby server) will be the DESTINATION.

oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -o setup


=========================================================

     Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b)
           http://www.dbvisit.com

=========================================================

=>dbvctl only needs to be run on the primary server.

Is this the primary server?  [Yes]:
The following Dbvisit Database configuration (DDC) file(s) found on this
server:

     DDC
     ===
1)   Create New DDC
2)   DBVPDB
3)   DBVPDB_SITE1
4)   DBVPOMF_SITE1
5)   Cancel

Please enter choice [] : 1

Is this correct?  [Yes]:

...


Continue ?  [No]: yes

=========================================================
Dbvisit Standby setup begins.
=========================================================
The following Oracle instance(s) have been found on this server:

     SID            ORACLE_HOME
     ===            ===========
1)   rdbms12201     /u01/app/oracle/product/12.2.0/dbhome_1
2)   DBVPDB         /u01/app/oracle/product/12.2.0/dbhome_1
3)   DBVPOMF        /u01/app/oracle/product/12.2.0/dbhome_1
4)   DUP            /u01/app/oracle/product/12.2.0/dbhome_1
5)   Enter own ORACLE_SID and ORACLE_HOME
Please enter choice [] : 2

Is this correct?  [Yes]:
=>ORACLE_SID will be: DBVPDB
=>ORACLE_HOME will be: /u01/app/oracle/product/12.2.0/dbhome_1

>>> Source database is a standby database. CASCADE flag will be turned on.

Yes to continue or No to cancel setup?  [Yes]:

...
...
...

Below are the list of configuration variables provided during the setup process:

Configuration Variable             Value Provided
======================             ==============
ORACLE_SID                         DBVPDB
ORACLE_HOME                        /u01/app/oracle/product/12.2.0/dbhome_1

SOURCE                             DBVS
ARCHSOURCE                         /u90/dbvisit_arch/DBVPDB_SITE2
RAC_DR                             N
USE_SSH                            Y
DESTINATION                        DBVS2
NETPORT                            22
DBVISIT_BASE_DR                    /oracle/u01/app/dbvisit
ORACLE_HOME_DR                     /u01/app/oracle/product/12.2.0/dbhome_1
DB_UNIQUE_NAME_DR                  DBVPDB_SITE3
ARCHDEST                           /u90/dbvisit_arch/DBVPDB_SITE3
ORACLE_SID_DR                      DBVPDB
ENV_FILE                           DBVPDB_CASCADED

Are these variables correct?  [Yes]:

>>> Dbvisit Database configuration (DDC) file DBVPDB_CASCADED created.

>>> Dbvisit Database repository (DDR) already installed.
   Repository Version          8.3
   Software Version            8.3
   Repository Status           VALID


Do you want to enter license key for the newly created Dbvisit Database configuration (DDC) file?  [Yes]:

Enter license key and press Enter: []: 4jo6z-8aaai-u09b6-ijjxe-cxks5-1114a-ozfvp
oracle@dbvs2's password:
>>> Dbvisit Standby License
License Key     : 4jo6z-8aaai-u09b6-ijjxe-cxks5-1114a-ozfvp
customer_number : 1
dbname          :
expiry_date     : 2019-05-29
product_id      : 8
sequence        : 1
status          : VALID
updated         : YES

PID:25571
TRACE:dbvisit_install.log

The dbvisit software detected that the SOURCE is already a standby database and automatically set the CASCADE flag to Y in the DDC configuration file.

>>> Source database is a standby database. CASCADE flag will be turned on.
oracle@DBVS:/u01/app/dbvisit/standby/conf/ [DBVPDB] grep CASCADE dbv_DBVPDB_CASCADED.env
# Variable: CASCADE
#      CASCADE = Y
CASCADE = Y

Synchronize first standby with primary

Ship archive logs from primary to first standby

oracle@DBVP:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 23506)
dbvctl started on DBVP: Wed May 15 01:24:55 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 3. Transfer log gap: 3
>>> Transferring Log file(s) from DBVPDB on DBVP to DBVS for thread 1:

    thread 1 sequence 50 (o1_mf_1_50_gfpmk7sg_.arc)
    thread 1 sequence 51 (o1_mf_1_51_gfpmkc7p_.arc)
    thread 1 sequence 52 (o1_mf_1_52_gfpmkf7w_.arc)

=============================================================
dbvctl ended on DBVP: Wed May 15 01:25:06 2019
=============================================================

Apply archive logs on first standby

oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 27769)
dbvctl started on DBVS: Wed May 15 01:25:25 2019
=============================================================


>>> Applying Log file(s) from DBVP to DBVPDB on DBVS:

>>> No new logs to apply.
    Last applied log(s):
    thread 1 sequence 52

    Next SCN required for recovery 885547 generated at 2019-05-15:01:24:29 +02:00.
    Next required log thread 1 sequence 53

=============================================================
dbvctl ended on DBVS: Wed May 15 01:25:27 2019
=============================================================

Run a gap report

oracle@DBVP:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB -i
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 23625)
dbvctl started on DBVP: Wed May 15 01:25:55 2019
=============================================================


Dbvisit Standby log gap report for DBVPDB_SITE1 thread 1 at 201905150125:
-------------------------------------------------------------
Destination database on DBVS is at sequence: 52.
Source database on DBVP is at log sequence: 53.
Source database on DBVP is at archived log sequence: 52.
Dbvisit Standby last transfer log sequence: 52.
Dbvisit Standby last transfer at: 2019-05-15 01:25:06.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:33.


=============================================================
dbvctl ended on DBVP: Wed May 15 01:25:58 2019
=============================================================

Synchronize cascaded standby with first standby

Ship archive logs from first standby to cascaded standby

oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB_CASCADED
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 27965)
dbvctl started on DBVS: Wed May 15 01:26:41 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 3. Transfer log gap: 3
>>> Transferring Log file(s) from DBVPDB on DBVS to DBVS2 for thread 1:

    thread 1 sequence 50 (1_50_979494498.arc)
    thread 1 sequence 51 (1_51_979494498.arc)
    thread 1 sequence 52 (1_52_979494498.arc)

=============================================================
dbvctl ended on DBVS: Wed May 15 01:26:49 2019
=============================================================
Apply archive log on cascaded standby
oracle@DBVS2:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB_CASCADED
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 21118)
dbvctl started on DBVS2: Wed May 15 01:27:21 2019
=============================================================


>>> Applying Log file(s) from DBVS to DBVPDB on DBVS2:

    thread 1 sequence 50 (1_50_979494498.arc)
    thread 1 sequence 51 (1_51_979494498.arc)
    thread 1 sequence 52 (1_52_979494498.arc)
    Last applied log(s):
    thread 1 sequence 52

    Next SCN required for recovery 885547 generated at 2019-05-15:01:24:29 +02:00.
    Next required log thread 1 sequence 53

=============================================================
dbvctl ended on DBVS2: Wed May 15 01:27:33 2019
=============================================================
Run a gap report
oracle@DBVS:/u01/app/dbvisit/standby/ [DBVPDB] ./dbvctl -d DBVPDB_CASCADED -i
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 28084)
dbvctl started on DBVS: Wed May 15 01:28:07 2019
=============================================================


Dbvisit Standby log gap report for DBVPDB_SITE2 thread 1 at 201905150128:
-------------------------------------------------------------
Destination database on DBVS2 is at sequence: 52.
Source database on DBVS is at applied log sequence: 52.
Dbvisit Standby last transfer log sequence: 52.
Dbvisit Standby last transfer at: 2019-05-15 01:26:49.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:00.


=============================================================
dbvctl ended on DBVS: Wed May 15 01:28:11 2019
=============================================================

Conclusion

With dbvisit we are able to configure several standby databases, choose an apply lag delay and also configure a cascaded standby. The downside is that the DDC configuration file needs to be manually adapted after each switchover.

The article Having multiple standby databases and cascading with dbvisit appeared first on the dbi services Blog.

Adding a dbvisit standby database on the ODA in a non-OMF environment


I have recently been working on a customer project where I was challenged with adding a dbvisit standby database on an ODA X7-2M, named ODA03. The existing customer environment was composed of an Oracle Standard Edition 12.2 database. The primary database, myDB, is running on a server named srv02 and using a non-OMF configuration. On the ODA side we are working with an OMF configuration. The dbvisit version available at that time was version 8; version 9 is currently the latest one and brings some cool new features. Through this blog I would like to share with you my experience, the problems I faced and the solution I could put in place.

Preparing the instance on the ODA

First of all, I created an instance-only database on the ODA.

[root@ODA03 ~]# odacli list-dbhomes

ID                                       Name                 DB Version                               Home Location                                 Status   
---------------------------------------- -------------------- ---------------------------------------- --------------------------------------------- ----------
ec33e32a-37d1-4d0d-8c40-b358dcf5660c     OraDB12201_home1     12.2.0.1.180717                          /u01/app/oracle/product/12.2.0.1/dbhome_1     Configured

[root@ODA03 ~]# odacli create-database -m -u myDB_03 -dn domain.name -n myDB -r ACFS -io -dh ec33e32a-37d1-4d0d-8c40-b358dcf5660c
Password for SYS,SYSTEM and PDB Admin:

Job details
----------------------------------------------------------------
                     ID:  96fd4d07-4604-4158-9c25-702c01f4493e
            Description:  Database service creation with db name: myDB
                 Status:  Created
                Created:  May 15, 2019 4:29:15 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@ODA03 ~]# odacli describe-job -i 96fd4d07-4604-4158-9c25-702c01f4493e

Job details
----------------------------------------------------------------
                     ID:  96fd4d07-4604-4158-9c25-702c01f4493e
            Description:  Database service creation with db name: myDB
                 Status:  Success
                Created:  May 15, 2019 4:29:15 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting up ssh equivalance               May 15, 2019 4:29:16 PM CEST        May 15, 2019 4:29:16 PM CEST        Success
Creating volume datmyDB                    May 15, 2019 4:29:16 PM CEST        May 15, 2019 4:29:38 PM CEST        Success
Creating volume reco                     May 15, 2019 4:29:38 PM CEST        May 15, 2019 4:30:00 PM CEST        Success
Creating ACFS filesystem for DATA        May 15, 2019 4:30:00 PM CEST        May 15, 2019 4:30:17 PM CEST        Success
Creating ACFS filesystem for RECO        May 15, 2019 4:30:17 PM CEST        May 15, 2019 4:30:35 PM CEST        Success
Database Service creation                May 15, 2019 4:30:35 PM CEST        May 15, 2019 4:30:51 PM CEST        Success
Auxiliary Instance Creation              May 15, 2019 4:30:35 PM CEST        May 15, 2019 4:30:47 PM CEST        Success
password file creation                   May 15, 2019 4:30:47 PM CEST        May 15, 2019 4:30:49 PM CEST        Success
archive and redo log location creation   May 15, 2019 4:30:49 PM CEST        May 15, 2019 4:30:49 PM CEST        Success
updating the Database version            May 15, 2019 4:30:49 PM CEST        May 15, 2019 4:30:51 PM CEST        Success

The next steps are really common DBA operations (a minimal sketch follows the list):

  • Create a pfile from the current primary database
  • Transfer the pfile to the ODA
  • Update the pfile as needed (path, db_unique_name, …)
  • Create a spfile from the pfile on the new ODA database
  • Apply ODA specific instance parameters
  • Copy or create the password file with the same password
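A minimal sketch of the pfile/spfile part (the paths and the spfile location are illustrative, not necessarily the exact ones used here):

-- on srv02 (primary)
SQL> create pfile='/tmp/initmyDB.ora' from spfile;

-- on ODA03, after transferring and editing the pfile
SQL> create spfile='/u02/app/oracle/oradata/myDB_03/dbs/spfilemyDB.ora' from pfile='/tmp/initmyDB.ora';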

The parameters that are mandatory to set on the ODA instance are the following:
*.db_create_file_dest='/u02/app/oracle/oradata/myDB_03'
*.db_create_online_log_dest_1='/u03/app/oracle/redo'
*.db_recovery_file_dest='/u03/app/oracle/fast_recovery_area'

Also, all the convert parameters (db_file_name_convert, log_file_name_convert) should be removed: using convert parameters is incompatible with OMF.
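Assuming the convert parameters were set in the spfile, they can be reset like this (sketch):

SQL> alter system reset db_file_name_convert scope=spfile;
SQL> alter system reset log_file_name_convert scope=spfile;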

Creating the standby database

Using dbvisit

I first tried to use dbvisit to create the standby database.

As with any common dbvisit operation, I first created the DDC configuration file from the primary server:

oracle@srv02:/u01/app/dbvisit/standby/ [myDB] ./dbvctl -o setup
...
...
...
Below are the list of configuration variables provided during the setup process:

Configuration Variable             Value Provided
======================             ==============
ORACLE_SID                         myDB
ORACLE_HOME                        /opt/oracle/product/12.2.0

SOURCE                             srv02
ARCHSOURCE                         /u03/app/oracle/dbvisit_arch/myDB
RAC_DR                             N
USE_SSH                            N
DESTINATION                        ODA03
NETPORT                            7890
DBVISIT_BASE_DR                    /u01/app/dbvisit
ORACLE_HOME_DR                     /u01/app/oracle/product/12.2.0.1/dbhome_1
DB_UNIQUE_NAME_DR                  myDB_03
ARCHDEST                           /u03/app/oracle/dbvisit_arch/myDB
ORACLE_SID_DR                      myDB
ENV_FILE                           myDBSTD1

Are these variables correct?  [Yes]:
...
...
...

I then used this DDC configuration file to create the standby database:

oracle@srv02:/u01/app/dbvisit/standby/ [myDB] ./dbvctl -d myDBSTD1 --csd


-------------------------------------------------------------------------------

INIT ORA PARAMETERS
-------------------------------------------------------------------------------
*              audit_file_dest                         /u01/app/oracle/admin/myDB/adump
*              compatible                              12.2.0
*              control_management_pack_access          NONE
*              db_block_size                           8192
*              db_create_file_dest                     /u02/app/oracle/oradata/myDB_03
*              db_create_online_log_dest_1             /u03/app/oracle/redo
*              db_domain
*              db_name                                 myDB
*              db_recovery_file_dest                   /u03/app/oracle/fast_recovery_area
*              db_recovery_file_dest_size              240G
*              db_unique_name                          myDB_03
*              diagnostic_dest                         /u01/app/oracle
*              dispatchers                             (PROTOCOL=TCP) (SERVICE=myDBXDB)
*              instance_mode                           READ-WRITE
*              java_pool_size                          268435456
*              log_archive_dest_1                      LOCATION=USE_DB_RECOVERY_FILE_DEST
*              open_cursors                            3000
*              optimizer_features_enable               12.2.0.1
*              pga_aggregate_target                    4194304000
*              processes                               8000
*              remote_login_passwordfile               EXCLUSIVE
*              resource_limit                          TRUE
*              sessions                                7552
*              sga_max_size                            53687091200
*              sga_target                              26843545600
*              shared_pool_reserved_size               117440512
*              spfile                                  OS default
*              statistics_level                        TYPICAL
*              undo_retention                          300
*              undo_tablespace                         UNDOTBS1

-------------------------------------------------------------------------------

Status: VALID

What would you like to do:
   1 - Create standby database using existing saved template
   2 - View content of existing saved template
   3 - Return to the previous menu
   Please enter your choice [1]:

This operation failed with the following errors:

Cannot create standby data or temp file /usr/oracle/oradata/myDB/myDB_bi_temp01.dbf for
primary file /usr/oracle/oradata/myDB/myDB_bi_temp01.dbf as location /usr/oracle/oradata/myDB
does not exist on the standby.

As per the dbvisit documentation, dbvisit standby is certified on the ODA and fully compatible with non-OMF and OMF databases. This is correct; the only constraint is that the full environment needs to be in the same configuration. That's to say that if the primary is OMF, the standby is expected to be OMF. If the primary is running a non-OMF configuration, the standby should be using non-OMF as well.

Using RMAN

I decided to duplicate the database using RMAN and a backup that I transferred locally to the ODA. The backup was the previous nightly inc0 backup. Before running the RMAN duplication, I executed a last archive log backup to make sure the most recent archive logs would be used in the duplication.
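That last archive log backup can be as simple as the following (sketch, assuming the same backup destination as the inc0 backup):

RMAN> backup archivelog all format '/opt/oracle/backup/myDB/arch_%U';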

I’m taking this opportunity to highlight that, thanks to the ODA NVMe technology, the duplication of the 3 TB database without multiple channels (Standard Edition) took only a bit more than 2 hours. On the existing servers this took about 10 hours.

I added the following TNS entry in the tnsnames.ora:

myDBSRV3 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ODA03.domain.name)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = myDB)
      (UR = A)
    )
  )

Of course, I could have used a local connection instead.

I made sure the database was in NOMOUNT status and ran the RMAN duplication:

oracle@ODA03:/opt/oracle/backup/ [myDB] rmanh

Recovery Manager: Release 12.2.0.1.0 - Production on Mon May 20 13:24:29 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

RMAN> connect auxiliary sys@myDBSRV3

auxiliary database Password:
connected to auxiliary database: myDB (not mounted)

RMAN> run {
2> duplicate target database for standby dorecover backup location '/opt/oracle/backup/myDB';
3> }

Starting Duplicate Db at 20-MAY-2019 13:25:51

contents of Memory Script:
{
   sql clone "alter system set  control_files =
  ''/u03/app/oracle/redo/myDB_03/controlfile/o1_mf_gg4qvpnn_.ctl'' comment=
 ''Set by RMAN'' scope=spfile";
   restore clone standby controlfile from  '/opt/oracle/backup/myDB/ctl_myDB_myDB_s108013_p1_newbak.ctl';
}
executing Memory Script

sql statement: alter system set  control_files =   ''/u03/app/oracle/redo/myDB_03/controlfile/o1_mf_gg4qvpnn_.ctl'' comment= ''Set by RMAN'' scope=spfile

Starting restore at 20-MAY-2019 13:25:51
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=9186 device type=DISK

channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01
output file name=/u03/app/oracle/redo/myDB_03/controlfile/o1_mf_gg4qvpnn_.ctl
Finished restore at 20-MAY-2019 13:25:52

contents of Memory Script:
{
   sql clone 'alter database mount standby database';
}
executing Memory Script

sql statement: alter database mount standby database
released channel: ORA_AUX_DISK_1
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=9186 device type=DISK

contents of Memory Script:
{
   set until scn  49713361973;
   set newname for clone tempfile  1 to new;
   set newname for clone tempfile  2 to new;
   switch clone tempfile all;
   set newname for clone datafile  1 to new;
   set newname for clone datafile  2 to new;
   set newname for clone datafile  3 to new;
   set newname for clone datafile  4 to new;
   set newname for clone datafile  5 to new;
   set newname for clone datafile  6 to new;
   set newname for clone datafile  7 to new;
   set newname for clone datafile  8 to new;
   set newname for clone datafile  10 to new;
   set newname for clone datafile  11 to new;
   set newname for clone datafile  12 to new;
   set newname for clone datafile  13 to new;
   set newname for clone datafile  14 to new;
   set newname for clone datafile  15 to new;
   set newname for clone datafile  16 to new;
   set newname for clone datafile  17 to new;
   set newname for clone datafile  18 to new;
   restore
   clone database
   ;
}
executing Memory Script

executing command: SET until clause

executing command: SET NEWNAME

executing command: SET NEWNAME

renamed tempfile 1 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_temp_%u_.tmp in control file
renamed tempfile 2 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lx_bi_te_%u_.tmp in control file

executing command: SET NEWNAME

...
...
...

executing command: SET NEWNAME

Starting restore at 20-MAY-2019 13:25:57
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00001 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_system_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00003 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_undotbs1_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00005 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lxdataid_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00006 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_renderz2_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00007 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lx_ods_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00008 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_users_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00013 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_renderzs_%u_.dbf
channel ORA_AUX_DISK_1: restoring datafile 00015 to /u02/app/oracle/oradata/myDB_03/myDB_03/datafile/o1_mf_lx_stagi_%u_.dbf
channel ORA_AUX_DISK_1: reading from backup piece /opt/oracle/backup/myDB/inc0_myDB_s107963_p1
...
...
...
archived log file name=/opt/oracle/backup/myDB/1_58043_987102791.dbf thread=1 sequence=58043
archived log file name=/opt/oracle/backup/myDB/1_58044_987102791.dbf thread=1 sequence=58044
archived log file name=/opt/oracle/backup/myDB/1_58045_987102791.dbf thread=1 sequence=58045
archived log file name=/opt/oracle/backup/myDB/1_58046_987102791.dbf thread=1 sequence=58046
archived log file name=/opt/oracle/backup/myDB/1_58047_987102791.dbf thread=1 sequence=58047
archived log file name=/opt/oracle/backup/myDB/1_58048_987102791.dbf thread=1 sequence=58048
archived log file name=/opt/oracle/backup/myDB/1_58049_987102791.dbf thread=1 sequence=58049
archived log file name=/opt/oracle/backup/myDB/1_58050_987102791.dbf thread=1 sequence=58050
media recovery complete, elapsed time: 00:12:40
Finished recover at 20-MAY-2019 16:06:22
Finished Duplicate Db at 20-MAY-2019 16:06:39

I could check that my standby database had been successfully created on the ODA:

oracle@ODA03:/u01/app/oracle/local/dmk/etc/ [myDB] myDB
********* dbi services Ltd. *********
STATUS                 : MOUNTED
DB_UNIQUE_NAME         : myDB_03
OPEN_MODE              : MOUNTED
LOG_MODE               : ARCHIVELOG
DATABASE_ROLE          : PHYSICAL STANDBY
FLASHBACK_ON           : NO
FORCE_LOGGING          : YES
CDB Enabled            : NO
*************************************

As a personal note, I really found using Oracle RMAN more convenient to duplicate a database. Although the dbvisit script and tool are really stable, I think that RMAN gives you more flexibility.

Registering the database in the grid cluster

As a next step, I registered the database in the grid.

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [LX] srvctl add database -db MyDB_03 -oraclehome /u01/app/oracle/product/12.2.0.1/dbhome_1 -dbtype SINGLE -instance MyDB -domain team-w.local -spfile /u02/app/oracle/oradata/MyDB_03/dbs/spfileMyDB.ora -pwfile /u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/orapwMyDB -role PHYSICAL_STANDBY -startoption MOUNT -stopoption IMMEDIATE -dbname MyDB -node ODA03 -acfspath "/u02/app/oracle/oradata/MyDB_03,/u03/app/oracle"

I stopped the database:

SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.

And started it again with the grid infrastructure:

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] MyDB
********* dbi services Ltd. *********
STATUS          : STOPPED
*************************************

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] srvctl status database -d MyDB_03
Instance MyDB is not running on node ODA03

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] srvctl start database -d MyDB_03

oracle@ODA03:/u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/ [MyDB] srvctl status database -d MyDB_03
Instance MyDB is running on node ODA03

dbvisit synchronization

We now have our standby database created on the ODA. We just need to synchronize it with the primary.

Run a gap report

Executing a gap report, we can see that the newly created database is running almost 4 hours behind.

oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d myDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 321953)
dbvctl started on srv02: Mon May 20 16:24:35 2019
=============================================================


Dbvisit Standby log gap report for myDB thread 1 at 201905201624:
-------------------------------------------------------------
Destination database on ODA03 is at sequence: 58050.
Source database on srv02 is at log sequence: 58080.
Source database on srv02 is at archived log sequence: 58079.
Dbvisit Standby last transfer log sequence: .
Dbvisit Standby last transfer at: .

Archive log gap for thread 1:  29.
Transfer log gap for thread 1: 58079.
Standby database time lag (DAYS-HH:MI:SS): +03:39:01.


=============================================================
dbvctl ended on srv02: Mon May 20 16:24:40 2019
=============================================================

Send the archive logs from primary to the standby database

I then shipped the last archive logs from the primary database to the newly created standby.

oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d myDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 326409)
dbvctl started on srv02: Mon May 20 16:29:14 2019
=============================================================

>>> Obtaining information from standby database (RUN_INSPECT=Y)... done
    Thread: 1 Archive log gap: 30. Transfer log gap: 58080
>>> Sending heartbeat message... skipped
>>> First time Dbvisit Standby runs, Dbvisit Standby configuration will be copied to
    ODA03...
>>> Transferring Log file(s) from myDB on srv02 to ODA03 for thread 1:

    thread 1 sequence 58051 (1_58051_987102791.dbf)
    thread 1 sequence 58052 (1_58052_987102791.dbf)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.dbf)
    thread 1 sequence 58080 (1_58080_987102791.dbf)

=============================================================
dbvctl ended on srv02: Mon May 20 16:30:50 2019
=============================================================

Apply archive logs on the standby database

Then I could finally apply the archive logs on the standby database.

oracle@ODA03:/u01/app/dbvisit/standby/ [myDB] ./dbvctl -d myDBSTD1
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 21504)
dbvctl started on ODA03: Mon May 20 16:33:42 2019
=============================================================

>>> Sending heartbeat message... skipped

>>> Applying Log file(s) from srv02 to myDB on ODA03:

    thread 1 sequence 58051 (1_58051_987102791.arc)
    thread 1 sequence 58052 (1_58052_987102791.arc)
...
...
...
    thread 1 sequence 58079 (1_58079_987102791.arc)
    thread 1 sequence 58080 (1_58080_987102791.arc)
    Last applied log(s):
    thread 1 sequence 58080

    Next SCN required for recovery 49719323442 generated at 2019-05-20:16:27:09 +02:00.
    Next required log thread 1 sequence 58081

=============================================================
dbvctl ended on ODA03: Mon May 20 16:36:52 2019
=============================================================

Run a gap report

Running a new gap report, we can see that there is no delta between the primary and the standby database.

oracle@srv02:/u01/app/dbvisit/standby/ [rdbms12201] ./dbvctl -d myDBSTD1 -i
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 335068)
dbvctl started on srv02: Mon May 20 16:37:53 2019
=============================================================


Dbvisit Standby log gap report for myDB thread 1 at 201905201637:
-------------------------------------------------------------
Destination database on ODA03 is at sequence: 58081.
Source database on srv02 is at log sequence: 58082.
Source database on srv02 is at archived log sequence: 58081.
Dbvisit Standby last transfer log sequence: 58081.
Dbvisit Standby last transfer at: 2019-05-20 16:37:36.

Archive log gap for thread 1:  0.
Transfer log gap for thread 1: 0.
Standby database time lag (DAYS-HH:MI:SS): +00:00:01.


=============================================================
dbvctl ended on srv02: Mon May 20 16:37:57 2019
=============================================================

Preparing the database for switchover

Are we done? Absolutely not. In order to successfully perform a switchover, 3 main changes are mandatory on the non-ODA server (running the non-OMF database):

  • The future database files should be OMF
  • The online redo log should be newly created
  • The temporary file should be newly created

Otherwise you might end up with an unsuccessful switchover showing the errors below:

>>> Starting Switchover between srv02 and ODA03

Running pre-checks       ... failed
No rollback action required

>>> Database on server srv02 is still a Primary Database
>>> Database on server ODA03 is still a Standby Database


<<<>>>
PID:40386
TRACEFILE:40386_dbvctl_switchover_myDBSTD1_201905272153.trc
SERVER:srv02
ERROR_CODE:1
Remote execution error on ODA03.

====================Remote Output start: ODA03=====================
<<<>>>
PID:92292
TRACEFILE:92292_dbvctl_f_gs_get_info_standby_myDBSTD1_201905272153.trc
SERVER:ODA03
ERROR_CODE:2146
Dbvisit Standby cannot proceed:
Cannot create standby data or temp file /usr/oracle/oradata/myDB/temp01.dbf for primary
file /usr/oracle/oradata/myDB/temp01.dbf as location /usr/oracle/oradata/myDB does not
exist on the standby.
Cannot create standby data or temp file /usr/oracle/oradata/myDB/lx_bi_temp01.dbf for
primary file /usr/oracle/oradata/myDB/lx_bi_temp01.dbf as location /usr/oracle/oradata/myDB
does not exist on the standby.
Review the following standby database parameters:
        db_create_file_dest = /u02/app/oracle/oradata/myDB_03
        db_file_name_convert =
>>>> Dbvisit Standby terminated <<<<
>>>> Dbvisit Standby terminated <<<<

Having a new OMF configuration

There is no need to convert the full database to OMF. A database can run with both file naming configurations, non-OMF and OMF. We just need the database to create new files with the OMF configuration from now on. For this, we will simply set the appropriate init parameters. In my case, the existing primary database was storing all data and redo files in the /opt/oracle/oradata directory.

SQL> alter system set DB_CREATE_FILE_DEST='/opt/oracle/oradata' scope=both;

System altered.

SQL> alter system set DB_CREATE_ONLINE_LOG_DEST_1='/opt/oracle/oradata' scope=both;

System altered.

Refresh the online log

We will create new OMF redo log files as described below.

The current redo log configuration:

SQL> select v$log.group#, member, v$log.status from v$logfile, v$log where v$logfile.group#=v$log.group#;

    GROUP# MEMBER                                             STATUS
---------- -------------------------------------------------- ----------
        12 /opt/oracle/oradata/myDB/redo12.log                  ACTIVE
        13 /opt/oracle/oradata/myDB/redo13.log                  CURRENT
        15 /opt/oracle/oradata/myDB/redo15.log                  INACTIVE
        16 /opt/oracle/oradata/myDB/redo16.log                  INACTIVE
         1 /opt/oracle/oradata/myDB/redo1.log                   INACTIVE
         2 /opt/oracle/oradata/myDB/redo2.log                   INACTIVE
        17 /opt/oracle/oradata/myDB/redo17.log                  INACTIVE
        18 /opt/oracle/oradata/myDB/redo18.log                  INACTIVE
        19 /opt/oracle/oradata/myDB/redo19.log                  INACTIVE
        20 /opt/oracle/oradata/myDB/redo20.log                  INACTIVE
         3 /opt/oracle/oradata/myDB/redo3.log                   INACTIVE
         4 /opt/oracle/oradata/myDB/redo4.log                   INACTIVE
         5 /opt/oracle/oradata/myDB/redo5.log                   INACTIVE
         6 /opt/oracle/oradata/myDB/redo6.log                   INACTIVE
         7 /opt/oracle/oradata/myDB/redo7.log                   INACTIVE
         8 /opt/oracle/oradata/myDB/redo8.log                   ACTIVE
         9 /opt/oracle/oradata/myDB/redo9.log                   ACTIVE
        10 /opt/oracle/oradata/myDB/redo10.log                  ACTIVE
        11 /opt/oracle/oradata/myDB/redo11.log                  ACTIVE
        14 /opt/oracle/oradata/myDB/redo14.log                  INACTIVE

For all INACTIVE redo log groups, we can drop the group and create it again with the OMF naming convention:

SQL> alter database drop logfile group 1;

Database altered.

SQL> alter database add logfile group 1;

Database altered.

In order to move to the next redo group and release the current one, we will switch the logfile:

SQL> alter system switch logfile;

System altered.

To move the ACTIVE redo logs to INACTIVE, we will run a checkpoint:

SQL> alter system checkpoint;

System altered.

And then drop and recreate the remaining INACTIVE redo groups:

SQL> alter database drop logfile group 10;

Database altered.

SQL> alter database add logfile group 10;

Database altered.

To finally have all our online logs in OMF format:

SQL> select v$log.group#, member, v$log.status from v$logfile, v$log where v$logfile.group#=v$log.group# order by group#;

    GROUP# MEMBER                                                       STATUS
---------- ------------------------------------------------------------ ----------
         1 /opt/oracle/oradata/myDB/onlinelog/o1_mf_1_ggqx5zon_.log       INACTIVE
         2 /opt/oracle/oradata/myDB/onlinelog/o1_mf_2_ggqxjky2_.log       INACTIVE
         3 /opt/oracle/oradata/myDB/onlinelog/o1_mf_3_ggqxjodl_.log       INACTIVE
         4 /opt/oracle/oradata/myDB/onlinelog/o1_mf_4_ggqxkddc_.log       INACTIVE
         5 /opt/oracle/oradata/myDB/onlinelog/o1_mf_5_ggqxkj1t_.log       INACTIVE
         6 /opt/oracle/oradata/myDB/onlinelog/o1_mf_6_ggqxkmnm_.log       CURRENT
         7 /opt/oracle/oradata/myDB/onlinelog/o1_mf_7_ggqxn373_.log       UNUSED
         8 /opt/oracle/oradata/myDB/onlinelog/o1_mf_8_ggqxn7b3_.log       UNUSED
         9 /opt/oracle/oradata/myDB/onlinelog/o1_mf_9_ggqxnbxd_.log       UNUSED
        10 /opt/oracle/oradata/myDB/onlinelog/o1_mf_10_ggqxvlbf_.log      UNUSED
        11 /opt/oracle/oradata/myDB/onlinelog/o1_mf_11_ggqxvnyg_.log      UNUSED
        12 /opt/oracle/oradata/myDB/onlinelog/o1_mf_12_ggqxvqyp_.log      UNUSED
        13 /opt/oracle/oradata/myDB/onlinelog/o1_mf_13_ggqxvv2o_.log      UNUSED
        14 /opt/oracle/oradata/myDB/onlinelog/o1_mf_14_ggqxxcq7_.log      UNUSED
        15 /opt/oracle/oradata/myDB/onlinelog/o1_mf_15_ggqxxgfg_.log      UNUSED
        16 /opt/oracle/oradata/myDB/onlinelog/o1_mf_16_ggqxxk67_.log      UNUSED
        17 /opt/oracle/oradata/myDB/onlinelog/o1_mf_17_ggqxypwg_.log      UNUSED
        18 /opt/oracle/oradata/myDB/onlinelog/o1_mf_18_ggqy1z78_.log      UNUSED
        19 /opt/oracle/oradata/myDB/onlinelog/o1_mf_19_ggqy2270_.log      UNUSED
        20 /opt/oracle/oradata/myDB/onlinelog/o1_mf_20_ggqy26bj_.log      UNUSED

20 rows selected.

Refresh temporary file

The database was using 2 temporary tablespaces: TEMP and myDB_BI_TEMP.

We first need to add new temp files in OMF format for both tablespaces.

SQL> alter tablespace TEMP add tempfile size 20G;

Tablespace altered.

SQL> alter tablespace myDB_BI_TEMP add tempfile size 20G;

Tablespace altered.

Both tablespaces will now include 2 files: the previous non-OMF one and a new OMF one:

SQL> @qdbstbsinf.sql
Enter a tablespace name filter (US%): TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
TEMP                 /opt/oracle/oradata/myDB/datafile/o1_mf_temp_ggrjzm9o_.tmp     ONLINE               20480 NO                    0
TEMP                 /usr/oracle/oradata/myDB/temp01.dbf                            ONLINE               20480 NO                    0

SQL> @qdbstbsinf.sql
Enter a tablespace name filter (US%): myDB_BI_TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
myDB_BI_TEMP           /opt/oracle/oradata/myDB/datafile/o1_mf_lx_bi_te_ggrk0wxz_.tmp ONLINE               20480 NO                    0
myDB_BI_TEMP           /usr/oracle/oradata/myDB/lx_bi_temp01.dbf                      ONLINE               20480 YES                5120

Dropping the temporary file now will end in an error:

SQL> alter database tempfile '/usr/oracle/oradata/myDB/temp01.dbf' drop including datafiles;
alter database tempfile '/usr/oracle/oradata/myDB/temp01.dbf' drop including datafiles
*
ERROR at line 1:
ORA-25152: TEMPFILE cannot be dropped at this time

We need to restart the database. This will only be possible during the maintenance window scheduled to run the switchover.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 5.3687E+10 bytes
Fixed Size                 26330584 bytes
Variable Size            3.3152E+10 bytes
Database Buffers         2.0401E+10 bytes
Redo Buffers              107884544 bytes
Database mounted.
Database opened.

The previous non-OMF temporary files can now be dropped:

SQL>  alter database tempfile '/usr/oracle/oradata/myDB/temp01.dbf' drop including datafiles;

Database altered.

SQL> alter database tempfile '/usr/oracle/oradata/myDB/lx_bi_temp01.dbf' drop including datafiles;

Database altered.

And we now only have OMF temporary files:

SQL>  @qdbstbsinf.sql
Enter a tablespace name filter (US%): TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
TEMP                 /opt/oracle/oradata/myDB/datafile/o1_mf_temp_ggrjzm9o_.tmp     ONLINE               20480 NO                    0

SQL>  @qdbstbsinf.sql
Enter a tablespace name filter (US%): myDB_BI_TEMP

TABLESPACE_NAME      FILE_NAME                                                    STATUS             SIZE_MB AUTOEXTENSIB MAXSIZE_MB
-------------------- ------------------------------------------------------------ --------------- ---------- ------------ ----------
myDB_BI_TEMP           /opt/oracle/oradata/myDB/datafile/o1_mf_lx_bi_te_ggrk0wxz_.tmp ONLINE               20480 NO                    0

Testing switchover

We are now ready to test the switchover from the current primary on srv02 to the ODA03 server, after making sure both databases are synchronized.

oracle@srv02:/u01/app/dbvisit/standby/ [MyDB] ./dbvctl -d MyDBSTD1 -o switchover
=============================================================
Dbvisit Standby Database Technology (8.0.26_0_g3fdeaadd) (pid 12196)
dbvctl started on srv02: Tue May 28 00:07:34 2019
=============================================================

>>> Starting Switchover between srv02 and ODA03

Running pre-checks       ... done
Pre processing           ... done
Processing primary       ... done
Processing standby       ... done
Converting standby       ... done
Converting primary       ... done
Completing               ... done
Synchronizing            ... done
Post processing          ... done

>>> Graceful switchover completed.
    Primary Database Server: ODA03
    Standby Database Server: srv02

>>> Dbvisit Standby can be run as per normal:
    dbvctl -d MyDBSTD1


PID:12196
TRACE:12196_dbvctl_switchover_MyDBSTD1_201905280007.trc

=============================================================
dbvctl ended on srv02: Tue May 28 00:13:31 2019
=============================================================

Conclusion

With dbvisit standby, it is possible to mix non-OMF and OMF databases after completing several manual steps. The final recommendation would be to run a uniform configuration. This is why, after having run a switchover to the new ODA03 database and making sure the new database was stable, we recreated the old primary database on srv02 from scratch with an OMF configuration. Converting a database to OMF using the datafile move option is not possible with Standard Edition.

The article Adding a dbvisit standby database on the ODA in a non-OMF environment appeared first on the dbi services Blog.

AEM Forms – “2-way-SSL” Setup and Workbench configuration


For almost the past two years, I have been working with AEM (Adobe Experience Manager) Forms. The road taken by this project was full of problems because of security constraints that AEM has/had big trouble dealing with. In this blog, I will talk about one security aspect which brings some trouble: how to set up and use the “2-way-SSL” (I will describe below why I put that in quotes) for the AEM Workbench.

I initially used AEM Forms 6.4.0 (20180228) with its associated Workbench version. I will assume that AEM Forms has already been installed and is working properly. In this case, I used AEM Forms on a WebLogic Server (12.2) which I configured in HTTPS. So once you have that, what do you need to do to configure and use the AEM Workbench with “2-way-SSL”? Well first, let’s ensure that the AEM Workbench is working properly and then start with the setup.

Open the AEM Workbench and configure a new “Server”:

  • Open the AEM Workbench (run the workbench.exe file)
  • Click on “File > Login
  • Click on “Configure...”
  • Click on the “+” sign to add a new Server
    • Set the Server Title to: <AEM_HOST> – SimpleAuth
    • Set the Hostname to: <AEM_HOST>
    • Set the Protocol to: Simple Object Access Protocol (SOAP/HTTPs)
    • Set the Server Port Number to: <AEM_PORT>
    • Click on “OK
  • Click on “OK
  • Set the Log on to the newly created Server (“<AEM_HOST> – SimpleAuth“)
  • Set the Username to: administrator (or whatever other account you have)
  • Set the Password for this account
  • Click on “Login

Workbench login 1-way-SSL

If everything was done properly, the login should be working. The next step is to configure AEM for the “2-way-SSL” communications. As mentioned at the beginning of this blog, I put that in quotes because it is a 2-way-SSL but one security layer is bypassed when doing it. With the AEM Workbench in 1-way-SSL, you need to enter a username and a password. Adding 2-way-SSL would normally just add another layer of security where the server and the client exchange their certificates and trust each other, but the user’s authentication is still needed!

In the case of the AEM Workbench, the “2-way-SSL” setup actually completely bypasses the user’s authentication and therefore I do not really consider that a real 2-way-SSL setup… It might even be considered a security issue (it’s a shame for a feature that is supposed to increase security) because, as you will see below, as soon as you have the Client SSL Certificate (and its password obviously), you will be able to access the AEM Workbench. So protect this certificate with great care.

To configure AEM, you will then need to create a Hybrid Domain:

  • Open the AEM AdminUI (https://<AEM_HOST>:<AEM_PORT>/adminui)
  • Login with the administrator account (or whatever other account you have)
  • Navigate to: Settings > User Management > Domain Management
  • Click on “New Hybrid Domain
    • Set the ID to: SSLMutualAuthProvider
    • Set the Name to: SSLMutualAuthProvider
    • Check the “Enable Account Locking” checkbox
    • Uncheck the “Enable Just In Time Provisioning” checkbox
    • Click on “Add Authentication
      • Set the “Authentication Provider” to: Custom
      • Check the “SSLMutualAuthProvider” checkbox
      • Click on the “OK
    • Click on the “OK

Note: If “SSLMutualAuthProvider” isn’t available on the Authentication page, then please check this blog.

Hybrid Domain 1

Hybrid Domain 2

Hybrid Domain 3

Then you will need to create a user. In this example, I will use a generic account but it is also possible to have one account per developer, in which case each user must have their own SSL Certificate. The user Canonical Name and ID must absolutely match the CN used to generate the SSL Certificate that the Client will use (a sketch of generating such a certificate follows the screenshots below). So if you generated an SSL Certificate for the Client with “/C=CH/ST=Jura/L=Delemont/O=dbi services/OU=IT/CN=aem-dev“, then the Canonical Name and ID to be used for the user in AEM should be “aem-dev“:

  • Navigate to: Settings > User Management > Users and Groups
  • Click on “New User
  • On the New User (Step 1 of 3) screen:
    • Uncheck the “System Generated” checkbox
    • Set the Canonical Name to: <USER_CN>
    • Set the First Name to: 2-way-SSL
    • Set the Last Name to: User
    • Set the Domain to: SSLMutualAuthProvider
    • Set the User Id to: <USER_CN>
    • Click on “Next
  • On the New User: 2-way-SSL (Step 2 of 3) screen:
    • Click on “Next
  • On the New User: 2-way-SSL (Step 3 of 3) screen:
    • Click on “Find Roles
      • Check the checkbox for the Role Name: Application Administrator (or any other valid role that you want this user to be able to use)
      • Click on the “OK” button
  • Click on the “Finish” button

User 1

User 2

User 3
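As a sketch (all file names, passwords and the validity period are illustrative; adapt them to your PKI), a matching self-signed client certificate and the keystore used later for the Workbench login could be generated like this:

openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/C=CH/ST=Jura/L=Delemont/O=dbi services/OU=IT/CN=aem-dev" \
  -keyout aem-dev.key -out aem-dev.crt
openssl pkcs12 -export -in aem-dev.crt -inkey aem-dev.key -name aem-dev -out aem-dev.p12
keytool -importkeystore -srckeystore aem-dev.p12 -srcstoretype PKCS12 -destkeystore aem-dev.jks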

At this point, you can configure your Application Server to handle the 2-way-SSL communications. In WebLogic Server, this is done by setting the “Two Way Client Cert Behavior” to “Client Certs Requested and Enforced” in the SSL subtab of the Managed Server(s) hosting the AEM Forms applications.

Finally, the last step is to get back to the AEM Workbench and try your 2-way-SSL communications. If you try again to use the SimpleAuth that we defined above, it should fail because the Application Server will require the Client SSL Certificate, which isn’t provided in this case. So let’s create a new “Server”:

  • Click on “File > Login
  • Click on “Configure...”
  • Click on the “+” sign to add a new Server
    • Set the Server Title to: <AEM_HOST> – MutualAuth
    • Set the Hostname to: <AEM_HOST>
    • Set the Protocol to: Simple Object Access Protocol (SOAP/HTTPs) Mutual Auth
    • Set the Server Port Number to: <AEM_PORT>
    • Click on “OK
  • Click on “OK
  • Set the Log on to the newly created Server (“<AEM_HOST> – MutualAuth“)
  • Set the Key Store to: file:C:\Users\Morgan\Documents\AEM_Workbench\aem-dev.jks (Adapt to wherever you put the keystore)
  • Set the Key Store Password to: <KEYSTORE_PWD>
  • Set the Trust Store to: file:C:\Users\Morgan\Documents\AEM_Workbench\trust.jks (Adapt to wherever you put the truststore)
  • Set the Trust Store Password to: <TRUSTSTORE_PWD>
  • Click on “Login

Workbench login 2-way-SSL

In the above login screen, the KeyStore is the SSL Certificate that was created for the Client and the TrustStore will be used to validate/trust the SSL Certificate of the AEM Server. It can be the cacerts from the AEM Workbench for example. If you are using a Self-Signed SSL Certificate, don’t forget to add the Trust Chain into the TrustStore.
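For example, with a self-signed setup, the AEM Server certificate (or its CA) can be added to the truststore using keytool (sketch; the alias and file names are illustrative):

keytool -importcert -alias aem-server-ca -file ca.crt -keystore trust.jks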

The article AEM Forms – “2-way-SSL” Setup and Workbench configuration appeared first on the dbi services Blog.

AEM Forms – No SSLMutualAuthProvider available


In the process of setting up the AEM Workbench to use 2-way-SSL, you will at some point need to use a Hybrid Domain and a specific Authentication Provider. Depending on the version of AEM that you are using, this Authentication Provider might not be present and therefore you will never be able to set that up properly. In this blog, I will describe what was done in our case to solve this problem.

The first time we tried to set that up (WebLogic Server 12.2, AEM 6.4.0), it just wasn’t working. We opened a case with the Adobe Support and, after quite some time, found out that the documentation was not complete (#CQDOC-13273): there were actually missing steps and missing configuration inside AEM to allow the 2-way-SSL to work. So basically everything said that 2-way-SSL was possible but there were just missing pieces inside AEM to have it really working. After discussion & investigation, the Adobe Support Engineers (#NPR-26490) provided us with the missing piece: adobe-usermanager-ssl-dsc.jar.

When you install AEM Forms, it will automatically deploy a bunch of DSCs (jar files) to provide all features of AEM Forms. Here are a few examples:

  • adobe-pdfservices-dsc.jar
  • adobe-usermanager-dsc.jar
  • adobe-jobmanager-dsc.jar
  • adobe-scheduler-weblogic-dsc.jar

Therefore, our AEM Forms version at that time (mid-2018, AEM 6.4.0) was missing one of these DSCs and it was the root cause of our issue. So what can you do to fix that? Well, you just have to deploy it and since we are anyway in the middle of working with the AEM Workbench to set it up with 2-way-SSL, that’s perfect. While the Workbench is still able to use 1-way-SSL (don’t set your Application Server in 2-way-SSL or revert it to 1-way-SSL):

  • Download or request the file “adobe-usermanager-ssl-dsc.jar” for your AEM version from the Adobe Support
  • Open the AEM Workbench (run the workbench.exe file)
  • Click on “File > Login
  • Set the Log on to to: <AEM_HOST> – SimpleAuth (or whatever the name of your SimpleAuth is)
  • Set the Username to: administrator (or whatever other account you have)
  • Set the Password for this account
  • Click on “Login
  • Click on “Window > Show View > Components
  • The Components window should be opened (if not already done before) somewhere on the screen (most probably on the left side)
  • Inside the Components window, right click on the “Components” folder and select “Install Component …
  • Find the file “adobe-usermanager-ssl-dsc.jar” that has been downloaded earlier, select it and click on “Open
  • Right click on the “Components” folder and select “Refresh
  • Expand the “Components” folder (if not already done), and look for the component named “SSLAuthProvider
  • If this component isn’t started yet (there is a red square on the package), then start it using the following steps:
    • Right click on “SSLAuthProvider
    • Select “Start Component

Note: If the “SSLAuthProvider” component already exists, then you will see an error. This is fine, it just needs to be there and to be started/running. If this is the case then it’s all good.

Workbench - Open components

Workbench - Refresh components

Workbench - Start component

Once the SSLAuthProvider DSC has been installed and is running, you should be able to see the SSLMutualAuthProvider in the list of custom providers while creating the Hybrid Domain on the AdminUI. Adobe was normally supposed to fix this in the following releases but I didn’t get the opportunity to test the installation of AEM 6.5 from scratch yet. If you have this information, don’t hesitate to share!

The article AEM Forms – No SSLMutualAuthProvider available appeared first on the dbi services Blog.

PostgreSQL check_function_bodies, what is it good for?


One of the probably lesser known PostgreSQL parameters is check_function_bodies. If you know Oracle, then you have for sure faced “invalid objects” a lot. In PostgreSQL, by default, there is nothing like an invalid object. That implies that you cannot create a function or procedure which references an object that does not yet exist.

Let’s assume you want to create a function like this, but the table “t1” does not exist:

postgres=# create or replace function f1 () returns setof t1 as
$$
select * from t1;
$$ language 'sql';
ERROR:  type "t1" does not exist

PostgreSQL will not create the function as a dependent object does not exist. Once the table is there, the function can be created:

postgres=# create table t1 ( a int );
CREATE TABLE
postgres=# insert into t1 values(1);
INSERT 0 1
postgres=# create or replace function f1 () returns setof t1 as
$$
select * from t1;
$$ language 'sql';
CREATE FUNCTION
postgres=# select * from f1();
a
---
1

The issue with that is that you need to follow the order in which objects get created. Especially when you need to load functions for other users, that can easily become tricky and time consuming. This is where check_function_bodies helps:

postgres=# set check_function_bodies = false;
SET
postgres=# create or replace function f2 () returns setof t1 as
$$
select * from t2;
$$ language 'sql';
CREATE FUNCTION

The function was created although t2 did not exist. Executing the function right now will of course generate an error:

postgres=# select * from f2();
ERROR:  relation "t2" does not exist
LINE 2: select * from t2;
^
QUERY:
select * from t2;

CONTEXT:  SQL function "f2" during startup

Once the table is there all is fine:

postgres=# create table t2 ( a int );
CREATE TABLE
postgres=# insert into t2 values (2);
INSERT 0 1
postgres=# select * from f2();
a
---
2

This is very helpful when loading objects provided by an external vendor. pg_dump does that by default, as shown in the excerpt below.
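For illustration, the header of a plain-format pg_dump output contains exactly this setting (excerpt; the surrounding SET lines vary with the version):

SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET check_function_bodies = false;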

The article PostgreSQL check_function_bodies, what is it good for? appeared first on the dbi services Blog.


Provisioning an AKS cluster and KubeInvaders with Terraform


Provisioning a K8s infrastructure may be performed in different ways. Terraform has a connector called the Kubernetes provider, but it doesn’t allow building and deploying a Kubernetes cluster: the cluster must be up and running before using the provider. Fortunately, there are different cloud-specific providers, depending on which cloud provider you want to use to provision your cluster.

In our CI pipeline for the MSSQL DMK maintenance, we provision SQL Server containers on Linux to then perform different tests. Our K8s infrastructure is managed by Azure through AKS and the main issue we have had so far is that the cluster must exist before deploying containers. On the other hand, the period of testing is unpredictable and depends mainly on the team availability. The first approach was to leave the AKS cluster up and running all the time to avoid breaking the CI pipeline, but the main drawback is obviously the cost. One solution is to use the Terraform provider for AKS through our Azure DevOps pipeline.

But in this blog, let’s start with something fun: provisioning the AKS infrastructure to host the KubeInvaders project. KubeInvaders is a funny way to explain the different components of K8s and I will use it during my next workshop SQL Server on Kubernetes at SQLSaturday Lisbon on November 29-30th 2019. In addition, deploying this project on AKS requires additional components, including an Ingress controller, a cert manager and an issuer, to connect to the KubeInvaders software from outside. In turn, those components are deployed through helm charts, meaning we also must install helm and tiller (if you use a helm version < 3).

Before sharing my code, let me say that there are already plenty of blog posts that explain how to build a Terraform plan to deploy an AKS cluster and helm components. Therefore, I prefer to share my notes and the issues I experienced during this work.

1) In my context, I already manage another AKS cluster from my laptop and I spent some time understanding that the Kubernetes provider always first tries to load a config file from a given (or default) location, as stated in the Terraform documentation. As a result, I experienced some weird behaviors when computing the Terraform plan, with detection of some components, like service accounts, that were supposed to not exist yet. In fact, the provider loaded my default config file ($HOME/.kube/config), which was tied to another existing AKS cluster.

2) I had to take care of the module execution order to get a consistent result. Terraform resource dependencies can be implicit or explicit (controlled by the depends_on clause), as shown in the sketch below.
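A minimal sketch of both flavors (the resource name "demo" is illustrative):

resource "kubernetes_namespace" "demo" {
    # implicit dependency: interpolating an attribute of another resource
    metadata {
        name = "demo-${azurerm_kubernetes_cluster.aks-ci.dns_prefix}"
    }

    # explicit dependency: forces ordering even without any interpolation
    depends_on = [ "azurerm_kubernetes_cluster.aks-ci" ]
}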

3) The “local-exec” provisioner remains useful to do additional work that cannot be managed directly by a Terraform resource. It was especially helpful to update some KubeInvaders files with the fresh new cluster connection info as well as the URL to reach the KubeInvaders software.
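For instance, a null_resource with a “local-exec” provisioner can patch the KubeInvaders manifest with the public URL once it is known (hypothetical sketch; the placeholder token and file path are illustrative):

resource "null_resource" "patch_kubeinvaders" {
    provisioner "local-exec" {
        # replace a placeholder host in the cloned manifest with the real ingress FQDN
        command = "sed -i 's/__INGRESS_HOST__/${azurerm_public_ip.nginx_ingress.fqdn}/g' ./KubeInvaders/deployment.yaml"
    }

    depends_on = [ "azurerm_public_ip.nginx_ingress" ]
}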

4) Tiller is installed only after deploying a first helm chart.

The components are deployed as follows:

  • Create Azure AKS resource group
  • Create AKS cluster
  • Load Kubernetes provider with newly created AKS cluster credentials
  • Create helm tiller service account
  • Bind helm tiller service account with cluster-admin role (can be improved if production scenario)
  • Save new cluster config into azure_config file for future connections from other CLI tools
  • Create new namespaces for Ingress controller, cert manager and KubeInvaders
  • Load helm provider with newly created AKS cluster credentials
  • Deploy Ingress controller and configuration
  • Deploy cert manager and cluster issuer and configuration
  • Install KubeInvaders

For the cert manager and cluster issuer, in order to create an ingress controller with a static public address and TLS certificates, I referred to the Microsoft documentation.
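As an illustration, the resulting staging cluster issuer looks roughly like this with the cert-manager API of that time (sketch, not the exact manifest used here; the email and issuer name come from the variable defaults shown below):

apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: user@contoso.com
    privateKeySecretRef:
      name: letsencrypt-staging
    http01: {}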

For KubeInvaders, I just cloned the project in the same directory as my Terraform files and I used the deployment file for Kubernetes.

Here is my bash script used to save my current kubeconfig file before running the Terraform stuff:

#!/bin/bash

# Save and clear the current K8s contexts before applying terraform
read -p "Press [Enter] to save current K8s contexts ..."

export KUBECONFIG=

if [ -f "$HOME/.kube/config" ] 
then 
    mv $HOME/.kube/config $HOME/.kube/config.sav
fi

# Generate terraform plan
read -p "Press [Enter] to generate plan ..."
terraform plan -out out.plan

# Apply terraform plan
read -p "Press [Enter] to apply plan ..."
terraform apply out.plan

# Restore saved K8s contexts
read -p "Press [Enter] to restore default K8s contexts ..."
if [ -f "$HOME/.kube/config.sav" ]; then mv $HOME/.kube/config.sav $HOME/.kube/config; fi

# Load new context
export KUBECONFIG=./.kube/azure_config

 

Here are my Terraform files:

  •  variables.tf
variable "client_id" {}
variable "client_secret" {}

variable "agent_count" {
    default = 3
}

variable "ssh_public_key" {
    default = "~/.ssh/id_rsa.pub"
}

variable "dns_prefix" {
    default = "aksci"
}

variable cluster_name {
    default = "aksci"
}

variable resource_group_name {
    default = "aks-grp"
}

variable location {
    default = "westeurope"
}

variable domain_name_label {
    default = "xxxx-ingress"
}

variable letsencrypt_email_address {
    default = "user@contoso.com"
}

variable letsencrypt_environment {
    default = "letsencrypt-staging"
}

 

  •  K8s_main.tf
###############################################
#     Create Azure Resource Group for AKS     #
###############################################   
resource "azurerm_resource_group" "aks-ci" {
    name     = "${var.resource_group_name}"
    location = "${var.location}"

    tags = {
        Environment = "CI"
    }
}

###############################################
#             Create AKS cluster              #
###############################################  
resource "azurerm_kubernetes_cluster" "aks-ci" {

    name                = "${var.cluster_name}"
    location            = "${azurerm_resource_group.aks-ci.location}"
    resource_group_name = "${azurerm_resource_group.aks-ci.name}"
    dns_prefix          = "${var.dns_prefix}"
    kubernetes_version  = "1.14.6"

    role_based_access_control {
        enabled = true
    }

    linux_profile {
        admin_username = "clustadmin"

        ssh_key {
                key_data = "${file("${var.ssh_public_key}")}"
        }
    }
 
    agent_pool_profile {
        name            = "agentpool"
        count           = "${var.agent_count}"
        vm_size         = "Standard_DS2_v2"
        os_type         = "Linux"
        os_disk_size_gb = 30
    }

    addon_profile {
        kube_dashboard {
            enabled = true
        }
    }
 
    service_principal {
        client_id     = "${var.client_id}"
        client_secret = "${var.client_secret}"
    }
 
    tags = {
        Environment = "CI"
    }
}

###############################################
#            Load Provider K8s                #
###############################################  

provider "kubernetes" {
    host                   = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.host}"
    client_certificate     = "${base64decode(azurerm_kubernetes_cluster.aks-ci.kube_config.0.client_certificate)}"
    client_key             = "${base64decode(azurerm_kubernetes_cluster.aks-ci.kube_config.0.client_key)}"
    cluster_ca_certificate = "${base64decode(azurerm_kubernetes_cluster.aks-ci.kube_config.0.cluster_ca_certificate)}"
    alias                  = "aks-ci"
}

###############################################
#       Create tiller service account         #
###############################################  
resource "kubernetes_service_account" "tiller" {
    provider = "kubernetes.aks-ci"
    
    metadata {
        name      = "tiller"
        namespace = "kube-system"
    }
     
    automount_service_account_token = true

    depends_on = [ "azurerm_kubernetes_cluster.aks-ci" ]
}

###############################################
#     Create tiller cluster role binding      #
############################################### 
 resource "kubernetes_cluster_role_binding" "tiller" {
    provider = "kubernetes.aks-ci"

    metadata {
        name = "tiller"
    }

    role_ref {
         kind      = "ClusterRole"
         name      = "cluster-admin"
         api_group = "rbac.authorization.k8s.io"
    }

    subject {
        kind      = "ServiceAccount"
        name      = "${kubernetes_service_account.tiller.metadata.0.name}"
        api_group = ""
        namespace = "kube-system"
    }

    depends_on = ["kubernetes_service_account.tiller"]
 }

###############################################
#   Save kube-config into azure_config file   #
###############################################
resource "null_resource" "save-kube-config" {

    triggers = {
        config = "${azurerm_kubernetes_cluster.aks-ci.kube_config_raw}"
    }
    provisioner "local-exec" {
        command = "mkdir -p ${path.module}/.kube && echo '${azurerm_kubernetes_cluster.aks-ci.kube_config_raw}' > ${path.module}/.kube/azure_config && chmod 0600 ${path.module}/.kube/azure_config"
    }

    depends_on = [ "azurerm_kubernetes_cluster.aks-ci" ]
}

###############################################
#    Create Azure public IP and DNS for       #
#    Azure load balancer                      #
###############################################
resource "azurerm_public_ip" "nginx_ingress" {
  
    name                = "nginx_ingress-ip"
    location            = "WestEurope"
    resource_group_name = "${azurerm_kubernetes_cluster.aks-ci.node_resource_group}" #"${azurerm_resource_group.aks-ci.name}"
    allocation_method   = "Static"
    domain_name_label   = "${var.domain_name_label}"

    tags = {
        environment = "CI"
    }

    depends_on = [ "azurerm_kubernetes_cluster.aks-ci" ]
}

###############################################
#       Create namespace nginx_ingress        #
###############################################
resource "kubernetes_namespace" "nginx_ingress" {
    provider = "kubernetes.aks-ci"

    metadata {
        name = "ingress-basic"
    } 

    depends_on = [ "azurerm_kubernetes_cluster.aks-ci" ]
}

###############################################
#       Create namespace cert-manager         #
###############################################
resource "kubernetes_namespace" "cert-manager" {
    provider = "kubernetes.aks-ci"

    metadata {
        name = "cert-manager"
    } 

    depends_on = [ "azurerm_kubernetes_cluster.aks-ci" ]
}

###############################################
#       Create namespace kubeinvaders         #
###############################################
resource "kubernetes_namespace" "kubeinvaders" {
    provider = "kubernetes.aks-ci"
    
    metadata {
        name = "foobar"
    } 
    
    depends_on = [ "azurerm_kubernetes_cluster.aks-ci" ]
}

###############################################
#             Load Provider helm              #
###############################################

provider "helm" {
    install_tiller  = true
    service_account = "${kubernetes_service_account.tiller.metadata.0.name}"

    kubernetes {
        host                   = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.host}"
        client_certificate     = "${base64decode(azurerm_kubernetes_cluster.aks-ci.kube_config.0.client_certificate)}"
        client_key             = "${base64decode(azurerm_kubernetes_cluster.aks-ci.kube_config.0.client_key)}"
        cluster_ca_certificate = "${base64decode(azurerm_kubernetes_cluster.aks-ci.kube_config.0.cluster_ca_certificate)}"
    }

}

###############################################
#        Load helm stable repository          #
###############################################
data "helm_repository" "stable" {
  name = "stable"
  url  = "https://kubernetes-charts.storage.googleapis.com"
}

###############################################
#       Install nginx ingress controller      #
###############################################
resource "helm_release" "nginx_ingress" {

     name       = "nginx-ingress"
     repository = "${data.helm_repository.stable.metadata.0.name}"
     chart      = "nginx-ingress"
     timeout    = 2400
     namespace  = "${kubernetes_namespace.nginx_ingress.metadata.0.name}"

     set {
         name  = "controller.replicaCount"
         value = "1"
     }
     set {
         name  = "controller.service.loadBalancerIP"
         value = "${azurerm_public_ip.nginx_ingress.ip_address}"
     }
     set_string {
         name  = "service.beta.kubernetes.io/azure-load-balancer-resource-group"
         value = "${azurerm_resource_group.aks-ci.name}"
     }

    depends_on = [ "kubernetes_cluster_role_binding.tiller" ]
}

###############################################
#       Install and configure cert_manager    #
###############################################
resource "helm_release" "cert_manager" {
    keyring = ""
    name = "cert-manager"
    chart = "stable/cert-manager"
    namespace = "kube-system"
    
    depends_on = ["helm_release.nginx_ingress"]
    
    set {
         name  = "webhook.enabled"
         value = "false"
    }

    provisioner "local-exec" {
        command = "kubectl --kubeconfig=${path.module}/.kube/azure_config apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.6/deploy/manifests/00-crds.yaml"
    }
    provisioner "local-exec" {
        command = "kubectl --kubeconfig=${path.module}/.kube/azure_config  label namespace kube-system certmanager.k8s.io/disable-validation=\"true\" --overwrite"
    }
    provisioner "local-exec" {
        command = "kubectl --kubeconfig=${path.module}/.kube/azure_config create -f ${path.module}/cluster_issuer/cluster-issuer.yaml"
    }
}

###############################################
#            Install Kubeinvaders             #
###############################################
resource "null_resource" "kubeinvaders" {
    triggers = {
        build_number = "${timestamp()}"
        config = "${azurerm_kubernetes_cluster.aks-ci.kube_config_raw}"
    }
    
    depends_on = ["helm_release.nginx_ingress","helm_release.cert_manager"]
    provisioner "local-exec" {
        command = "sed -i \"0,/ROUTE_HOST=.*/s//ROUTE_HOST=${var.domain_name_label}.westeurope.cloudapp.azure.com/\" ${path.module}/KubeInvaders/deploy_kubeinvaders.sh"
    }
    provisioner "local-exec" {
        command = "sed \"s/||toto||/${var.domain_name_label}.westeurope.cloudapp.azure.com/\" ${path.module}/KubeInvaders/kubernetes/kubeinvaders-ingress.template > ${path.module}/KubeInvaders/kubernetes/kubeinvaders-ingress.yml"
    }
    provisioner "local-exec" {
        command = "cd ${path.module}/KubeInvaders && ./deploy_kubeinvaders.sh"
    }
}

 

  •  output.tf
output "client_key" {
    value = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.client_key}"
}

output "client_certificate" {
    value = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.client_certificate}"
}

output "cluster_ca_certificate" {
    value = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.cluster_ca_certificate}"
}

output "cluster_username" {
    value = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.username}"
}

output "cluster_password" {
    value = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.password}"
}

output "kube_config" {
    value = "${azurerm_kubernetes_cluster.aks-ci.kube_config_raw}"
}

output "host" {
    value = "${azurerm_kubernetes_cluster.aks-ci.kube_config.0.host}"
}

 

After provisioning my AKS cluster through Terraform, here is the fun result… 🙂

Et voilà! Next time, I will continue with a write-up about provisioning an AKS cluster through Azure DevOps and a Terraform module.

See you!

This article Provisioning a AKS cluster and KubeInvaders with Terraform appeared first on Blog dbi services.

PostgreSQL 13 will come with partitioning support for pgbench


A lot of people use pgbench to benchmark a PostgreSQL instance, and pgbench is also heavily used by the PostgreSQL developers. While declarative partitioning was introduced in PostgreSQL 10, there was no support for it in pgbench, not even in the current version, which is PostgreSQL 12. With PostgreSQL 13, which is currently in development, this will change: pgbench will be able to create a partitioned pgbench_accounts table you can then run your benchmark against.

Having a look at the parameters of pgbench in PostgreSQL 13, two new ones pop up:

postgres@centos8pg:/home/postgres/ [pgdev] pgbench --help
pgbench is a benchmarking tool for PostgreSQL.

Usage:
pgbench [OPTION]... [DBNAME]

Initialization options:
-i, --initialize         invokes initialization mode
-I, --init-steps=[dtgvpf]+ (default "dtgvp")
run selected initialization steps
-F, --fillfactor=NUM     set fill factor
-n, --no-vacuum          do not run VACUUM during initialization
-q, --quiet              quiet logging (one message each 5 seconds)
-s, --scale=NUM          scaling factor
--foreign-keys           create foreign key constraints between tables
--index-tablespace=TABLESPACE
create indexes in the specified tablespace
--partitions=NUM         partition pgbench_accounts in NUM parts (default: 0)
--partition-method=(range|hash)
partition pgbench_accounts with this method (default: range)
--tablespace=TABLESPACE  create tables in the specified tablespace
--unlogged-tables        create tables as unlogged tables

Options to select what to run:
-b, --builtin=NAME[@W]   add builtin script NAME weighted at W (default: 1)
(use "-b list" to list available scripts)
-f, --file=FILENAME[@W]  add script FILENAME weighted at W (default: 1)
-N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches
(same as "-b simple-update")
-S, --select-only        perform SELECT-only transactions
(same as "-b select-only")

Benchmarking options:
-c, --client=NUM         number of concurrent database clients (default: 1)
-C, --connect            establish new connection for each transaction
-D, --define=VARNAME=VALUE
define variable for use by custom script
-j, --jobs=NUM           number of threads (default: 1)
-l, --log                write transaction times to log file
-L, --latency-limit=NUM  count transactions lasting more than NUM ms as late
-M, --protocol=simple|extended|prepared
protocol for submitting queries (default: simple)
-n, --no-vacuum          do not run VACUUM before tests
-P, --progress=NUM       show thread progress report every NUM seconds
-r, --report-latencies   report average latency per command
-R, --rate=NUM           target rate in transactions per second
-s, --scale=NUM          report this scale factor in output
-t, --transactions=NUM   number of transactions each client runs (default: 10)
-T, --time=NUM           duration of benchmark test in seconds
-v, --vacuum-all         vacuum all four standard tables before tests
--aggregate-interval=NUM aggregate data over NUM seconds
--log-prefix=PREFIX      prefix for transaction time log file
(default: "pgbench_log")
--progress-timestamp     use Unix epoch timestamps for progress
--random-seed=SEED       set random seed ("time", "rand", integer)
--sampling-rate=NUM      fraction of transactions to log (e.g., 0.01 for 1%)
--show-script=NAME       show builtin script code, then exit

Common options:
-d, --debug              print debugging output
-h, --host=HOSTNAME      database server host or socket directory
-p, --port=PORT          database server port number
-U, --username=USERNAME  connect as specified database user
-V, --version            output version information, then exit
-?, --help               show this help, then exit

Report bugs to .

That should give us partitions according to the number of partitions and the partitioning method we chose, so let’s populate a new database:

postgres@centos8pg:/home/postgres/ [pgdev] psql -c "create database pgbench" postgres
CREATE DATABASE
Time: 326.715 ms
postgres@centos8pg:/home/postgres/ [pgdev] pgbench -i -s 10 --partitions=10 --partition-method=range --foreign-keys pgbench
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
creating 10 partitions...
generating data...
100000 of 1000000 tuples (10%) done (elapsed 0.20 s, remaining 1.78 s)
200000 of 1000000 tuples (20%) done (elapsed 0.40 s, remaining 1.62 s)
300000 of 1000000 tuples (30%) done (elapsed 0.74 s, remaining 1.73 s)
400000 of 1000000 tuples (40%) done (elapsed 1.23 s, remaining 1.85 s)
500000 of 1000000 tuples (50%) done (elapsed 1.47 s, remaining 1.47 s)
600000 of 1000000 tuples (60%) done (elapsed 1.81 s, remaining 1.21 s)
700000 of 1000000 tuples (70%) done (elapsed 2.25 s, remaining 0.97 s)
800000 of 1000000 tuples (80%) done (elapsed 2.46 s, remaining 0.62 s)
900000 of 1000000 tuples (90%) done (elapsed 2.81 s, remaining 0.31 s)
1000000 of 1000000 tuples (100%) done (elapsed 3.16 s, remaining 0.00 s)
vacuuming...
creating primary keys...
creating foreign keys...
done in 5.78 s (drop tables 0.00 s, create tables 0.07 s, generate 3.29 s, vacuum 0.84 s, primary keys 0.94 s, foreign keys 0.65 s).

The pgbench_accounts table should now be partitioned by range:

postgres@centos8pg:/home/postgres/ [pgdev] psql -c "\d+ pgbench_accounts" pgbench
Partitioned table "public.pgbench_accounts"
Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+-------------
aid      | integer       |           | not null |         | plain    |              |
bid      | integer       |           |          |         | plain    |              |
abalance | integer       |           |          |         | plain    |              |
filler   | character(84) |           |          |         | extended |              |
Partition key: RANGE (aid)
Indexes:
"pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Foreign-key constraints:
"pgbench_accounts_bid_fkey" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid)
Referenced by:
TABLE "pgbench_history" CONSTRAINT "pgbench_history_aid_fkey" FOREIGN KEY (aid) REFERENCES pgbench_accounts(aid)
Partitions: pgbench_accounts_1 FOR VALUES FROM (MINVALUE) TO (100001),
pgbench_accounts_10 FOR VALUES FROM (900001) TO (MAXVALUE),
pgbench_accounts_2 FOR VALUES FROM (100001) TO (200001),
pgbench_accounts_3 FOR VALUES FROM (200001) TO (300001),
pgbench_accounts_4 FOR VALUES FROM (300001) TO (400001),
pgbench_accounts_5 FOR VALUES FROM (400001) TO (500001),
pgbench_accounts_6 FOR VALUES FROM (500001) TO (600001),
pgbench_accounts_7 FOR VALUES FROM (600001) TO (700001),
pgbench_accounts_8 FOR VALUES FROM (700001) TO (800001),
pgbench_accounts_9 FOR VALUES FROM (800001) TO (900001)

The same should work for hash partitioning:

postgres@centos8pg:/home/postgres/ [pgdev] pgbench -i -s 10 --partitions=10 --partition-method=hash --foreign-keys pgbench
dropping old tables...
creating tables...
creating 10 partitions...
generating data...
100000 of 1000000 tuples (10%) done (elapsed 0.19 s, remaining 1.69 s)
200000 of 1000000 tuples (20%) done (elapsed 0.43 s, remaining 1.71 s)
300000 of 1000000 tuples (30%) done (elapsed 0.67 s, remaining 1.55 s)
400000 of 1000000 tuples (40%) done (elapsed 1.03 s, remaining 1.54 s)
500000 of 1000000 tuples (50%) done (elapsed 1.22 s, remaining 1.22 s)
600000 of 1000000 tuples (60%) done (elapsed 1.59 s, remaining 1.06 s)
700000 of 1000000 tuples (70%) done (elapsed 1.80 s, remaining 0.77 s)
800000 of 1000000 tuples (80%) done (elapsed 2.16 s, remaining 0.54 s)
900000 of 1000000 tuples (90%) done (elapsed 2.36 s, remaining 0.26 s)
1000000 of 1000000 tuples (100%) done (elapsed 2.69 s, remaining 0.00 s)
vacuuming...
creating primary keys...
creating foreign keys...
done in 4.99 s (drop tables 0.10 s, create tables 0.08 s, generate 2.74 s, vacuum 0.84 s, primary keys 0.94 s, foreign keys 0.30 s).
postgres@centos8pg:/home/postgres/ [pgdev] psql -c "\d+ pgbench_accounts" pgbench
Partitioned table "public.pgbench_accounts"
Column  |     Type      | Collation | Nullable | Default | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+-------------
aid      | integer       |           | not null |         | plain    |              |
bid      | integer       |           |          |         | plain    |              |
abalance | integer       |           |          |         | plain    |              |
filler   | character(84) |           |          |         | extended |              |
Partition key: HASH (aid)
Indexes:
"pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Foreign-key constraints:
"pgbench_accounts_bid_fkey" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid)
Referenced by:
TABLE "pgbench_history" CONSTRAINT "pgbench_history_aid_fkey" FOREIGN KEY (aid) REFERENCES pgbench_accounts(aid)
Partitions: pgbench_accounts_1 FOR VALUES WITH (modulus 10, remainder 0),
pgbench_accounts_10 FOR VALUES WITH (modulus 10, remainder 9),
pgbench_accounts_2 FOR VALUES WITH (modulus 10, remainder 1),
pgbench_accounts_3 FOR VALUES WITH (modulus 10, remainder 2),
pgbench_accounts_4 FOR VALUES WITH (modulus 10, remainder 3),
pgbench_accounts_5 FOR VALUES WITH (modulus 10, remainder 4),
pgbench_accounts_6 FOR VALUES WITH (modulus 10, remainder 5),
pgbench_accounts_7 FOR VALUES WITH (modulus 10, remainder 6),
pgbench_accounts_8 FOR VALUES WITH (modulus 10, remainder 7),
pgbench_accounts_9 FOR VALUES WITH (modulus 10, remainder 8)

Looks fine. Now you can easily benchmark against a partitioned pgbench_accounts table.
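
Running an actual benchmark then works exactly as before; for example (the client, thread and duration values are just an illustration):

postgres@centos8pg:/home/postgres/ [pgdev] pgbench -c 10 -j 2 -T 60 pgbench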

This article PostgreSQL 13 will come with partitioning support for pgbench appeared first on Blog dbi services.

Patroni Operations – Changing Parameters


Sooner or later, all of us have to change a parameter on the database. But how do you put this into practice on a Patroni cluster? There are, of course, some specifics you have to consider.
This post will give you a short introduction to this topic.

When you want to change a parameter on a Patroni cluster you have several possibilities:
– Dynamic configuration in DCS. These changes are applied asynchronously to every node.
– Local configuration in patroni.yml. This will take precedence over the dynamic configuration.
– Cluster configuration using “alter system”.
– Environment configuration using local environment variables.
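
By the way, to see what is currently stored in the DCS before touching anything, patronictl can print the dynamic configuration:

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl show-config PG1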

Change PostgreSQL parameters using patronictl

1. Change parameters that do not need a restart

If you want to change a parameter (or more) for the whole cluster, you should use patronictl. If you want to change the initial configuration as well, you should also adjust patroni.yml.

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl edit-config PG1

All parameters already set are shown and can be changed like in any other file using the vi commands:

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl edit-config PG1

loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    archive_command: /bin/true
    archive_mode: 'on'
    autovacuum_max_workers: '6'
    autovacuum_vacuum_scale_factor: '0.1'
    autovacuum_vacuum_threshold: '50'
    client_min_messages: WARNING
    effective_cache_size: 512MB
    hot_standby: 'on'
    hot_standby_feedback: 'on'
    listen_addresses: '*'
    log_autovacuum_min_duration: 60s
    log_checkpoints: 'on'
    log_connections: 'on'
    log_directory: pg_log
    log_disconnections: 'on'
    log_duration: 'on'
    log_filename: postgresql-%a.log
    log_line_prefix: '%m - %l - %p - %h - %u@%d - %x'
    log_lock_waits: 'on'
    log_min_duration_statement: 30s
    log_min_error_statement: NOTICE
    log_min_messages: WARNING
    log_rotation_age: '1440'
    log_statement: ddl
    log_temp_files: '0'
    log_timezone: Europe/Zurich
    log_truncate_on_rotation: 'on'
    logging_collector: 'on'
    maintenance_work_mem: 64MB
    max_replication_slots: 10
    max_wal_senders: '20'
    port: 5432
    shared_buffers: 128MB
    shared_preload_libraries: pg_stat_statements
    wal_compression: 'off'
    wal_keep_segments: 8
    wal_level: replica
    wal_log_hints: 'on'
    work_mem: 8MB
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30

Once saved, you get the following:

---
+++
@@ -2,7 +2,8 @@
 maximum_lag_on_failover: 1048576
 postgresql:
   parameters:
-    archive_command: /bin/true
+    archive_command: 'test ! -f /u99/pgdata/PG1/archived_wal/%f && cp %p /u99/pgdata/PG1/archived_wal/%f'
     archive_mode: 'on'
     autovacuum_max_workers: '6'
     autovacuum_vacuum_scale_factor: '0.1'

Apply these changes? [y/N]: y
Configuration changed

When connecting to the database, you will see that the parameter is changed now. It is also changed on all the other nodes.

 postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] sq
psql (11.5)
Type "help" for help.

postgres=# show archive_command;
                                  archive_command
------------------------------------------------------------------------------------
 test ! -f /u99/pgdata/PG1/archived_wal/%f && cp %p /u99/pgdata/PG1/archived_wal/%f
(1 row)

2. Change parameters that need a restart

How can parameters that need a restart be changed, especially as we want a minimal downtime of the cluster?
First of all, the parameter can be changed the same way as the parameters that do not need a restart, using patronictl edit-config. Once the parameter is changed, the status overview of the cluster gets a new column showing which node needs a restart.

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |        *        |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+

Afterwards, there are two possibilities.

2.1 Restart node by node

If you do not want to restart the whole cluster, you have the possibility to restart each node separately. Keep in mind that you have to restart the Leader node first, otherwise the change does not take effect. It is also possible to schedule the restart of a node, as sketched right below.
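
A minimal sketch of a non-interactive, scheduled restart (the timestamp is just an example):

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1 patroni1 --scheduled 2019-10-08T22:00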

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1 patroni1
When should the restart take place (e.g. 2019-10-08T15:33)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |        *        |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni1? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni1
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1 patroni2
When should the restart take place (e.g. 2019-10-08T15:34)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |                 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni2? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni2
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1 patroni3
When should the restart take place (e.g. 2019-10-08T15:34)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |                 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |                 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni3
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+

2.2 Restart the whole cluster

In case you don’t want to restart node by node and you have the possibility of a downtime, it is also possible to restart the whole cluster (scheduled or immediately).

postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl restart PG1
When should the restart take place (e.g. 2019-10-08T15:37)  [now]:
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB | Pending restart |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |        *        |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |        *        |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |        *        |
+---------+----------+----------------+--------+---------+----+-----------+-----------------+
Are you sure you want to restart members patroni1, patroni2, patroni3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member patroni1
Success: restart on member patroni2
Success: restart on member patroni3
postgres@patroni1:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 | Leader | running |  4 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  4 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 |        | running |  4 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+

Change PostgreSQL parameters using “alter system”

Of course you can change a parameter only on one node using “alter system”, too.

 postgres@patroni1:/home/postgres/ [PG1] sq
psql (11.5)
Type "help" for help.

postgres=# show archive_Command;
 archive_command
-----------------
 /bin/false
(1 row)

postgres=# alter system set archive_command='/bin/true';
ALTER SYSTEM

postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

postgres=# show archive_command;
 archive_command
-----------------
 /bin/true
(1 row)

Of course, the parameter change is not automatically applied to the replicas: the parameter is only changed on that node, and all the other nodes will keep the value from the DCS. So you can change the parameter using “patronictl edit-config” or with an “alter system” command on each node. But you also have to keep in mind the order in which the parameters are applied: the “alter system” change (persisted in postgresql.auto.conf) takes precedence over the “patronictl edit-config” change.
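
To drop such a node-local override again and fall back to the value coming from the DCS, you can remove it from postgresql.auto.conf with “alter system reset”, for example:

postgres=# alter system reset archive_command;
ALTER SYSTEM

postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)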

Conclusion

As long as you keep these specialities in mind, it is quite easy to change a parameter in a Patroni cluster. There are some parameters that need the same value on all nodes, e.g. max_connections, max_worker_processes and wal_level, and there are also some parameters controlled by Patroni itself, e.g. listen_addresses and port. For more details, check the Patroni documentation. And last but not least: if you change the configuration with patronictl and one node still has a different configuration, look for a postgresql.auto.conf file in the PGDATA directory; maybe there you can find the reason for the different parameters on your nodes.
If you are interested in more “Patroni Operations” blogs, also check this one: Patroni operations: Switchover and Failover.

This article Patroni Operations – Changing Parameters appeared first on Blog dbi services.

MariaDB 10.3.1x + 10.4.x : mysqld_multi not working properly


A couple of days ago, we upgraded some MariaDB instances on a consolidated database server. However, after the software update we were not able to manage our instances anymore using mysqld_multi, whatever the MariaDB release (10.3.18 or the latest 10.4.x).

Previously, everything worked fine with 10.3.14. So, what happened to our system? Let’s quickly start an instance:

mysql@vmoel:/u01/app/mysql/product/mariadb-10.3.18/bin/ [mysqld3] ./mysqld_multi start 3
elseif should be elsif at /u01/app/mysql/product/mariadb-10.3.18/bin/mysqld_multi line 352.
syntax error at /u01/app/mysql/product/mariadb-10.3.18/bin/mysqld_multi line 353, near ")
      {"
syntax error at /u01/app/mysql/product/mariadb-10.3.18/bin/mysqld_multi line 356, near "else"
syntax error at /u01/app/mysql/product/mariadb-10.3.18/bin/mysqld_multi line 404, near "}"
Illegal declaration of subroutine main::stop_mysqlds at /u01/app/mysql/product/mariadb-10.3.18/bin/mysqld_multi line 416.

Actually, the error message is quite simple to understand and to fix. We let you check & fix the bug by yourself (hint: “elseif should be elsif”) 😎
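
If you do not feel like opening an editor, a quick sed one-liner does the trick (the line number comes from the error message above, so double-check it against your copy of the script first):

mysql@vmoel:/u01/app/mysql/product/mariadb-10.3.18/bin/ [mysqld3] sed -i '352s/elseif/elsif/' ./mysqld_multi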

That’s it for now, but do your MariaDB instances start now? Hmm, try it and check your error log like we did:

2019-10-30 20:53:33 0 [Note] InnoDB: 10.3.18 started; log sequence number 1630952; transaction id 21
2019-10-30 20:53:33 0 [Note] InnoDB: Loading buffer pool(s) from /u02/mysqldata/mysqld3/ib_buffer_pool
2019-10-30 20:53:33 0 [Note] Plugin 'FEEDBACK' is disabled.
2019-10-30 20:53:33 0 [ERROR] /u01/app/mysql/product/mariadb-10.3.18/bin/mysqld: unknown variable 'defaults-group-suffix=mysqld3'
2019-10-30 20:53:33 0 [ERROR] Aborting

Still not working? Hmm, after some further analysis of the Perl code, especially the routine “start_mysqlds”, we could identify the real problem: as the error log shows, mysqld_multi hands the defaults-group-suffix option over to mysqld in a form the server does not understand.

While writing this article, we found that someone had already reported this bug: MDEV-20728! Thanks to him.

The fix is directly available on the MariaDB GitHub repository.
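
Once the patched mysqld_multi is in place, start the instance again and verify it with the report command:

mysql@vmoel:/u01/app/mysql/product/mariadb-10.3.18/bin/ [mysqld3] ./mysqld_multi start 3
mysql@vmoel:/u01/app/mysql/product/mariadb-10.3.18/bin/ [mysqld3] ./mysqld_multi report 3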

Good luck 🙂

This article MariaDB 10.3.1x + 10.4.x : mysqld_multi not working properly appeared first on Blog dbi services.

SLES15 SP1 – New features


With SLES15, SUSE introduced the Multimodal OS and the unified installer. That means you only get what you really need. Your OS is flexible and you can easily add features you need and remove them as well. But this article isn’t meant to be an explanation of the Multimodal OS; it will show you some of the new features of SLES15 SP1.

SUSE supports the migration from SLES15 to SLES15 SP1 in online mode.
You can upgrade in two ways: YaST migration (GUI) and Zypper migration (command line).
Be sure that your system is registered at the SUSE Customer Center or has a local RMT server. Afterwards, just use “zypper migration”, type the number of the product you want to migrate and accept the terms of the license. That’s it.
The easiest way to check whether the migration was successful:

sles15:~ # cat /etc/os-release | grep PRETTY_NAME
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1"
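
For reference, the Zypper variant of the migration itself boils down to two commands; a sketch (the product selection in “zypper migration” is interactive, and the system must be registered):

sles15:~ # SUSEConnect --status-text    # verify the registration first
sles15:~ # zypper migration             # then pick the target product and accept the license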

So let’s have a look at the new features and improvements of SLES15 SP1.

Unified Installer

SUSE Manager Server and Proxy are now available as base products. Both can be installed using the unified installer.
Point of Service and SLE Real Time are also included in the unified installer now.

Transactional Update

In OpenSUSE Leap and SUSE CaaS, transactional update was already implemented; now it is also possible to run transactional updates with SLE. To use it, the Transactional Server Module needs to be activated first (no additional key is needed). Afterwards, the transactional-update package and its dependencies can be installed.

sle15:~ #  SUSEConnect --product sle-module-transactional-server/15.1/x86_64
Registering system to SUSE Customer Center

Updating system details on https://scc.suse.com ...

Activating sle-module-transactional-server 15.1 x86_64 ...
-> Adding service to system ...
-> Installing release package ...

Successfully registered system
sle15:~ # zypper install transactional-update
Refreshing service 'Basesystem_Module_15_SP1_x86_64'.
Refreshing service 'SUSE_Linux_Enterprise_Server_15_SP1_x86_64'.
Refreshing service 'Server_Applications_Module_15_SP1_x86_64'.
Refreshing service 'Transactional_Server_Module_15_SP1_x86_64'.
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 6 NEW packages are going to be installed:
  attr bc openslp psmisc rsync transactional-update

6 new packages to install.
Overall download size: 686.6 KiB. Already cached: 0 B. After the operation, additional 1.3 MiB will be used.
Continue? [y/n/v/...? shows all options] (y): y

As you may know, SUSE uses btrfs with snapper as the default for the file systems. This builds the basis for transactional updates: updates are applied into a new snapshot, so the running system is not touched, and the updated snapshot is activated after the next reboot. So this is an update that is
– Atomic: either fully applied or not at all.
– Easily revertable: after a failed update, the return to the previous (running) system is easy.
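
In practice this looks as follows; a minimal sketch (the package name is just an example):

sle15:~ # transactional-update up              # apply all pending updates into a new snapshot
sle15:~ # transactional-update pkg install vim # or install a single package transactionally
sle15:~ # reboot                               # activate the new snapshot
sle15:~ # transactional-update rollback        # if needed, return to the previous snapshot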

Simplified btrfs layout

There is only one single subvolume under /var, not 10, for simplified and consistent snapshots. This only takes effect for fresh installations; upgraded systems still use the old layout.
Starting with SLES15 SP1, there is also the possibility to have each home directory as a single subvolume, but this is not the default.

Secure encrypted virtualization (SEV)

Data encryption is an important topic in today’s IT environments. Data stored on disk is widely encrypted, but what about the data in RAM? AMD’s SEV makes it possible to protect Linux KVM virtual machines by encrypting the memory of each VM with a unique key. It can also generate a signature that attests to the correct encryption.
This increases system security a lot and protects VMs from memory scrape attacks from the hypervisor.
With SLES15 SP1, SUSE provides full support for this technology. For further information about SEV, click here.

Quarterly Updates

Starting with 15 SP1, SUSE offers quarterly updates of the installation and package media. They will be refreshed every quarter with all maintenance and security updates. So, for the setup of new systems, there is always a recent and up-to-date state.

Conclusion

This is not the full list of new features, only an excerpt. Nevertheless, the transactional updates in particular make the upgrade to SLES15 SP1 worth the effort. And always think about the security improvements that come with every new release.

This article SLES15 SP1 – New features appeared first on Blog dbi services.

AEM Forms – WebLogic Clustering synch issue for Workbench 2-way-SSL


In a previous blog, I described the process to set up AEM Forms to allow the AEM Workbench to connect to AEM using “2-way-SSL”. This setup is normally independent of the Application Server that you are using to host AEM. However, I already faced an issue (other than this one) which was caused by the 2-way-SSL setup for the Workbench in case a WebLogic Cluster is used to host AEM.

As mentioned in a previous blog, I’m not an AEM expert but I know a few things about WebLogic, so the idea here was to set up a fully functional WebLogic Cluster composed of two Managed Servers on two hosts/machines, test it properly and then install the AEM Forms application on top of it. Obviously, AEM Forms was configured behind a Load Balancer for this purpose. At this point, AEM Forms was working perfectly in HA and stopping one of the nodes wasn’t a problem.

Then I configured the Workbench for 2-way-SSL. I did so while being connected to the AEM Node1 in Workbench, creating the Hybrid Domain in the AEM AdminUI Node1, and so on… At the end of the setup, the AEM Workbench was working properly with the 2-way-SSL setup as well, so it looked like the setup was complete. Just to be sure, I stopped the AEM Node1 and tried to log in to the AEM Workbench with the exact same parameters (same keystore, same truststore, same passwords) except for the target Server, which I switched to the AEM Node2. Doing so, the login failed and I could see the following message in the AEM Node2 Managed Server logs:

####<Feb 12, 2019 2:14:46,277 PM UTC> <Info> <EJB> <aem-node-2> <msAEM-02> <[ACTIVE] ExecuteThread: '76' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <81fe4dac-31f0-4c25-bf37-17d5b327a901-0000005e> <1549980886277> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-010227> <EJB exception occurred during invocation from home or business: com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionBMTAdapter_fw85em_Intf generated exception: ALC-DSC-124-000: com.adobe.idp.dsc.DSCAuthorizationException: User does not have the Service Read Permission.>

 
Just looking at this message, it’s clear that the user account that works properly on the AEM Node1 isn’t working on the AEM Node2. After some investigation, it looked like the Hybrid Domain wasn’t shown on the AEM AdminUI Node2, for some reason… Both nodes are using the same Oracle Database and the same GDS (Global Document Storage) path, so I thought that the issue might be related to a cache somewhere in AEM. Therefore, I thought about re-creating the Hybrid Domain but cancelled this move right away because I assumed it could have brought me more trouble than solutions (I didn’t want to create 2 objects with the same name, risk corruption or whatever…):

  • Open the AEM AdminUI Node2 (Node1 is still down) (https://<AEM_HOST_2>:<AEM_PORT>/adminui)
  • Login with an administative account (E.g.: administrator)
  • Navigate to: Settings > User Management > Domain Management
    • -> Only 1 domain is displayed, the default one: DefaultDom
  • Click on “New Hybrid Domain”
    • Click on “Cancel”

 
After doing that, the Hybrid Domain (the one created in this blog, named “SSLMutualAuthProvider”) magically appeared, so I assume that it forced a synchronization and an update of the cache on the AEM Node2. Trying to log in to the AEM Workbench again, without changing the parameters, printed the following in the AEM Node2 Managed Server logs:

####<Feb 12, 2019 2:30:43,208 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-2> <msAEM-02> <[ACTIVE] ExecuteThread: '117' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-24A18C6CA9D79C032EFA> <81fe4dac-31f0-4c25-bf37-17d5b327a901-00000067> <1549981843208> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>

 
The above message means that the login is successful and Workbench is able to load the data from AEM properly. I guess there are other ways to fix this issue: there is a “Sync Now” as well as a “Refresh” button on the Domain Management page of the AdminUI, so maybe these would have done the same thing and forced a synchronization… Since the trick above already solved my issue, I couldn’t test further, unfortunately. A restart of the AEM Node2 is also sufficient to force a refresh, but this takes a few minutes and requires a downtime, so it’s not ideal.

This article AEM Forms – WebLogic Clustering synch issue for Workbench 2-way-SSL appeared first on Blog dbi services.

AEM Forms – Certify PDF end-up with NoSuchMethodError on bouncycastle


As part of an AEM project, we were working on setting up a few actions on PDF files. One of these actions was to Sign & Certify a PDF file. The basic Sign & Certify action provided by AEM works easily by default but, if you look deeper, you might get some surprises. The complexity in this case came from the fact that we absolutely needed the signature to contain a valid Time-Stamp using the Time-Stamp Protocol (TSP) as well as a valid Long-Term Validation (LTV). In this blog, I will talk about one (of the numerous) issues we faced that I believe is related only to AEM on WebLogic.

As I mentioned above, the basic Certify operation works easily but, if you do not take a closer look, it might not be TSP- and/or LTV-compliant. In our case, using AEM 6.4 SP3 on WebLogic Server 12.2.1.3, we got the Certify operation to work but without TSP & LTV:

Certify PDF - TSP failed & LTV failed

Looking at the AEM Managed Server logs, you can see that the last line is an error message:

####<Aug 28, 2019 12:15:22,278 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-129013562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000055> <1566994522278> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:25,025 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-129513562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525025> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:25,680 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525680> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 12:15:25,681 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525681> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property LastCacheResetTime has been changed from  to 1555070921173>
####<Aug 28, 2019 12:15:25,681 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525681> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property CacheValidationTime has been changed from 0 to 1555070921058>
####<Aug 28, 2019 12:15:25,684 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525684> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,130 PM UTC> <Info> <Common> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-130E13562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526130> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000628> <Created "1" resources for pool "IDP_DS", out of which "1" are available and "0" are unavailable.>
####<Aug 28, 2019 12:15:26,141 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526141> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,147 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526147> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,153 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526153> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,158 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526158> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,571 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526571> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 12:15:27,835 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '60' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-13A613562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000057> <1566994527835> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:30,923 PM UTC> <Error> <com.adobe.workflow.AWS> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994530923> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <An exception was thrown with name java.lang.NoSuchMethodError message:org.bouncycastle.asn1.x509.AlgorithmIdentifier.getObjectId()Lorg/bouncycastle/asn1/ASN1ObjectIdentifier; while invoking service SignatureService and operation certify and no fault routes were found to be configured.>

 

At the same time, we also got this kind of messages:

ALC-DSC-003-000: com.adobe.idp.dsc.DSCInvocationException: Invocation error.
            at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:152)
            at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doRequired(EjbTransactionCMTAdapterBean.java:274)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
            at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doRequired(Unknown Source)
            at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:129)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
            at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.routeMessage(AbstractMessageReceiver.java:93)
            at com.adobe.idp.dsc.provider.impl.vm.VMMessageDispatcher.doSend(VMMessageDispatcher.java:225)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:69)
            at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:215)
            at com.adobe.workflow.engine.PEUtil.invokeAction(PEUtil.java:893)
            at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.transientInvoke(WorkflowDSCInvoker.java:356)
            at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.invoke(WorkflowDSCInvoker.java:159)
            at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doRequiresNew(EjbTransactionCMTAdapterBean.java:299)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
            at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doRequiresNew(Unknown Source)
            at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:143)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
            at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.invoke(AbstractMessageReceiver.java:329)
            at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invokeCall(SoapSdkEndpoint.java:153)
            at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invoke(SoapSdkEndpoint.java:91)
            at sun.reflect.GeneratedMethodAccessor621.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:397)
            at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:186)
            at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:323)
            at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
            at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
            at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
            at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:454)
            at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281)
            at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
            at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
            at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
            at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
            at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
            at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
            at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at com.adobe.idp.dsc.provider.impl.soap.axis.InvocationFilter.doFilter(InvocationFilter.java:43)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at com.adobe.idp.um.auth.filter.ParameterFilter.doFilter(ParameterFilter.java:105)
            at com.adobe.idp.um.auth.filter.CSRFFilter.invokeNextFilter(CSRFFilter.java:141)
            at com.adobe.idp.um.auth.filter.CSRFFilter.doFilter(CSRFFilter.java:132)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3706)
            at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3672)
            at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:328)
            at weblogic.security.service.SecurityManager.runAsForUserCode(SecurityManager.java:197)
            at weblogic.servlet.provider.WlsSecurityProvider.runAsForUserCode(WlsSecurityProvider.java:203)
            at weblogic.servlet.provider.WlsSubjectHandle.run(WlsSubjectHandle.java:71)
            at weblogic.servlet.internal.WebAppServletContext.doSecuredExecute(WebAppServletContext.java:2443)
            at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2291)
            at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2269)
            at weblogic.servlet.internal.ServletRequestImpl.runInternal(ServletRequestImpl.java:1705)
            at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1665)
            at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:272)
            at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352)
            at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337)
            at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57)
            at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
            at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:652)
            at weblogic.work.ExecuteThread.execute(ExecuteThread.java:420)
            at weblogic.work.ExecuteThread.run(ExecuteThread.java:360)
Caused by: java.lang.NoSuchMethodError: org.bouncycastle.asn1.x509.AlgorithmIdentifier.getObjectId()Lorg/bouncycastle/asn1/ASN1ObjectIdentifier;
            at com.adobe.livecycle.signatures.pki.timestamp.TimestampInfoBC.matchesMessageImprint(TimestampInfoBC.java:187)
            at com.adobe.livecycle.signatures.pki.timestamp.TimestampToken.validateRequest(TimestampToken.java:430)
            at com.adobe.livecycle.signatures.pki.impl.PKIOperations.createTimestamp(PKIOperations.java:562)
            at com.adobe.livecycle.signatures.service.impl.TimeStampProviderImpl.getTimestampToken(TimeStampProviderImpl.java:85)
            at com.adobe.idp.cryptoprovider.LCPKCS7Signer$1.getActualAttributes(LCPKCS7Signer.java:256)
            at com.adobe.livecycle.signatures.pki.signature.CMSPKCS7Impl.sign(CMSPKCS7Impl.java:702)
            at com.adobe.livecycle.signatures.pki.impl.PKIOperations.sign(PKIOperations.java:345)
            at com.adobe.livecycle.signatures.service.cryptoprovider.DSSPKCS7Signer.signData(DSSPKCS7Signer.java:84)
            at com.adobe.idp.cryptoprovider.LCPKCS7Signer.sign(LCPKCS7Signer.java:123)
            at com.adobe.internal.pdftoolkit.services.digsig.digsigframework.impl.SignatureHandlerPPKLite.writeSignatureAfterSave(SignatureHandlerPPKLite.java:816)
            at com.adobe.internal.pdftoolkit.services.digsig.impl.SigningUtils.doSigning(SigningUtils.java:820)
            at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.certifyWrapperAPI(SignatureManager.java:1554)
            at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.certify(SignatureManager.java:1542)
            at com.adobe.livecycle.signatures.service.impl.SignCertifyImpl.certify(SignCertifyImpl.java:894)
            at com.adobe.livecycle.signatures.service.impl.DocumentSecurityService.certify(DocumentSecurityService.java:1644)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:118)
            ... 102 more

 

Based on the above messages, it is clear that there is a problem with some of the bouncycastle classes. This kind of issue is usually either a missing class (“ClassNotFoundException“) or a conflict between two or more versions loaded by WebLogic (“NoSuchMethodError“), where the loaded/active version does not contain the specific Java method being called. We opened an SR with Adobe Support because this kind of thing shouldn’t be happening, but after a few days without any meaningful update from them, I decided to look into the product myself to stop losing time on such a trivial thing.

This specific class (“org.bouncycastle.asn1.x509.AlgorithmIdentifier“) can be found in numerous jar files: apacheds*.jar, bcprov*.jar, bouncycastle*.jar, ec2*.jar, and so on. I checked all these jar files in our WebLogic Server libraries as well as the AEM ones and found what I believe was the issue: different versions of these jars being loaded. To confirm, and before changing anything, I deployed the WebLogic Classloader Analysis Tool (CAT) and found:

  • 0 conflicts in adobe-livecycle-cq-author.ear
  • 0 conflicts in adobe-livecycle-native-weblogic-x86_linux.ear
  • 5339 conflicts in adobe-livecycle-weblogic.ear

 
These numbers pretty much confirmed what I thought already. Going further, I found a few hundred conflicts related to the “org.bouncycastle.*” classes only. One of these being for the class “org.bouncycastle.asn1.x509.AlgorithmIdentifier” and it was conflicting between the following files:

  • WebLogic: $MW_HOME/oracle_common/modules/org.bouncycastle.bcprov-jdk15on.jar (1st loaded)
  • WebLogic: $MW_HOME/oracle_common/modules/org.bouncycastle.bcprov-ext-jdk15on.jar
  • AEM: $APPLICATIONS/adobe-livecycle-weblogic.ear/bcprov-151.jar
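
If you want to check manually which jar files contain such a class, something like the following loop can be used (just a sketch; adapt the path to whatever set of libraries you want to scan):

[weblogic@aem-node-1 ~]$ for jar in $MW_HOME/oracle_common/modules/*.jar; do
>   unzip -l "$jar" 2>/dev/null | grep -q "org/bouncycastle/asn1/x509/AlgorithmIdentifier.class" && echo "$jar"
> done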

 
So what should be done to fix this? Well, a simple solution is just to force WebLogic to use the AEM-provided files by default by updating the classloading preferences:

[weblogic@aem-node-1 ~]$ cd $APPLICATIONS
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ jar -xvf adobe-livecycle-weblogic.ear META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ grep -B1 "</prefer-application-packages>" META-INF/weblogic-application.xml
<package-name>org.mozilla.javascript.xmlimpl.*</package-name>
</prefer-application-packages>
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ sed -i 's,</prefer-application-packages>,<package-name>org.bouncycastle.*</package-name>\n&,' META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ grep -B2 "</prefer-application-packages>" META-INF/weblogic-application.xml
<package-name>org.mozilla.javascript.xmlimpl.*</package-name>
<package-name>org.bouncycastle.*</package-name>
</prefer-application-packages>
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ jar -uvf adobe-livecycle-weblogic.ear META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ rm -rf META-INF
[weblogic@aem-node-1 AEM]$

 

The above commands simply add “<package-name>org.bouncycastle.*</package-name>” just before the end of the “<prefer-application-packages>” section, so that WebLogic knows it needs to use the AEM-provided classes for this package and shouldn’t use its own files. Once that is done, simply redeploy the EAR file. In my case, I was left with “only” 2442 conflicts, none of them regarding bouncycastle (obviously).

After that, executing the same Certify action with the new classloader preferences resulted in no more errors:

####<Aug 28, 2019 1:12:22,359 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '109' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-1475E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000071> <1566997942359> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 1:12:23,702 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-147BE729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997943702> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 1:12:24,199 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944199> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 1:12:24,199 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944199> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property LastCacheResetTime has been changed from  to 1555070921173>
####<Aug 28, 2019 1:12:24,200 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944200> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property CacheValidationTime has been changed from 0 to 1555070921058>
####<Aug 28, 2019 1:12:24,202 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944202> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,691 PM UTC> <Info> <Common> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14F2E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944691> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000628> <Created "1" resources for pool "IDP_DS", out of which "1" are available and "0" are unavailable.>
####<Aug 28, 2019 1:12:24,704 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944704> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,710 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944710> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,717 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944717> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,724 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944724> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,928 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944928> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>

 

The generated PDF now contained correct Time-Stamp information but still no LTV information:

Certify PDF - TSP working & LTV failed

Finally, adding a Validation step after the Certify step in the process (in the AEM Application (LCA)) allowed both TSP and LTV information to be shown properly:

Certify PDF - TSP working & LTV working



pg_auto_failover: Setup and installation


When I attended PGIBZ 2019 earlier this year, I talked with Dimitri about pg_auto_failover and I promised to have a look at it. Well, almost half a year later, and after we met again at pgconf.eu, it is time to actually do that. You probably already know that Citus Data was acquired by Microsoft earlier this year and that Microsoft has been committed to open source for a few years now. pg_auto_failover is one of the projects they contribute back to the PostgreSQL community. This will be a multi-post series, and this very first post is all about getting it up and running. In a following post we will then look at failover and switchover scenarios.

As usual, when you need auto failover you need at least three nodes and pg_auto_failover is no exception to that. The following graphic is stolen from the pg_auto_failover github page:

We have one PostgreSQL master, one PostgreSQL replica and in addition a monitoring host. In my case that maps to:

pg-af1.it.dbi-services.com master 192.168.22.70
pg-af2.it.dbi-services.com replica 192.168.22.71
pg-af3.it.dbi-services.com monitor/cluster management 192.168.22.72

All of these nodes run CentOS 8 and I will be installing from source code as that gives the most flexibility. As pg_auto_failover depends on PostgreSQL (of course), the first step is to install PostgreSQL on all three nodes (PostgreSQL 12 in this setup). If you need further information on how to do that you can e.g. check here. Basically, these steps have been executed on all three nodes (given that the postgres user already exists and sudo is configured):

[postgres@pg-af1 ~]$ sudo dnf install -y gcc openldap-devel python36-devel readline-devel redhat-lsb bison flex perl-ExtUtils-Embed zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel openssh-clients bzip2 net-tools wget unzip sysstat xorg-x11-xauth systemd-devel bash-completion python36 policycoreutils-python-utils make git
[postgres@pg-af1 ~]$ wget https://ftp.postgresql.org/pub/source/v12.0/postgresql-12.0.tar.bz2
[postgres@pg-af1 ~]$ tar -axf postgresql-12.0.tar.bz2
[postgres@pg-af1 ~]$ cd postgresql-12.0
[postgres@pg-af1 postgresql-12.0]$ sudo mkdir -p /u01 /u02
[postgres@pg-af1 postgresql-12.0]$ sudo chown postgres:postgres /u01 /u02
[postgres@pg-af1 postgresql-12.0]$ PGHOME=/u01/app/postgres/product/12/db_0/
[postgres@pg-af1 postgresql-12.0]$ SEGSIZE=2
[postgres@pg-af1 postgresql-12.0]$ BLOCKSIZE=8
[postgres@pg-af1 postgresql-12.0]$ WALSEGSIZE=64
[postgres@pg-af1 postgresql-12.0]$ ./configure --prefix=${PGHOME} \
> --exec-prefix=${PGHOME} \
> --bindir=${PGHOME}/bin \
> --libdir=${PGHOME}/lib \
> --sysconfdir=${PGHOME}/etc \
> --includedir=${PGHOME}/include \
> --datarootdir=${PGHOME}/share \
> --datadir=${PGHOME}/share \
> --with-pgport=5432 \
> --with-perl \
> --with-python \
> --with-openssl \
> --with-pam \
> --with-ldap \
> --with-libxml \
> --with-libxslt \
> --with-segsize=${SEGSIZE} \
> --with-blocksize=${BLOCKSIZE} \
> --with-systemd \
> --with-extra-version=" dbi services build"
[postgres@pg-af1 postgresql-12.0]$ make all
[postgres@pg-af1 postgresql-12.0]$ make install
[postgres@pg-af1 postgresql-12.0]$ cd contrib
[postgres@pg-af1 contrib]$ make install
[postgres@pg-af1 contrib]$ cd ../..
[postgres@pg-af1 ~]$ rm -rf postgresql*

We will go for an installation from source code of pg_auto_failover as well (again, on all three nodes):

postgres@pg-af1:/home/postgres/ [pg120] git clone https://github.com/citusdata/pg_auto_failover.git
postgres@pg-af1:/home/postgres/ [pg120] cd pg_auto_failover/
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make install
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] cd ..
postgres@pg-af1:/home/postgres/ [pg120] rm -rf pg_auto_failover/

That’s it, quite easy. What I especially like is that there are no dependencies on Python or any other libraries except for PostgreSQL. What the installation gives us is basically pg_autoctl:

postgres@pg-af1:/home/postgres/ [pg120] pg_autoctl --help
pg_autoctl: pg_auto_failover control tools and service
usage: pg_autoctl [ --verbose --quiet ]


Available commands:
pg_autoctl
+ create   Create a pg_auto_failover node, or formation
+ drop     Drop a pg_auto_failover node, or formation
+ config   Manages the pg_autoctl configuration
+ show     Show pg_auto_failover information
+ enable   Enable a feature on a formation
+ disable  Disable a feature on a formation
run      Run the pg_autoctl service (monitor or keeper)
stop     signal the pg_autoctl service for it to stop
reload   signal the pg_autoctl for it to reload its configuration
help     print help message
version  print pg_autoctl version

The first step in setting up the cluster is to initialize the monitoring node:

postgres@pg-af3:/home/postgres/ [pg120] pg_autoctl create --help
pg_autoctl create: Create a pg_auto_failover node, or formation

Available commands:
pg_autoctl create
monitor    Initialize a pg_auto_failover monitor node
postgres   Initialize a pg_auto_failover standalone postgres node
formation  Create a new formation on the pg_auto_failover monitor

postgres@pg-af3:/home/postgres/ [pg120] sudo mkdir -p /u02/pgdata
postgres@pg-af3:/home/postgres/ [pg120] sudo chown postgres:postgres /u02/pgdata
postgres@pg-af3:/home/postgres/ [pg120] unset PGDATABASE
postgres@pg-af3:/home/postgres/ [] pg_autoctl create monitor --pgdata /u02/pgdata/PG12/af
INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/PG12/af"
INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/PG12/af --options "-p 5432" --options "-h *" --wait start
INFO  Granting connection privileges on 192.168.22.0/24
INFO  Your pg_auto_failover monitor instance is now ready on port 5432.
INFO  pg_auto_failover monitor is ready at postgres://autoctl_node@pg-af3:5432/pg_auto_failover
INFO  Monitor has been succesfully initialized.

Once that succeeds you’ll have a new PostgreSQL instance running along with the pg_auto_failover PostgreSQL background worker processes:

postgres@pg-af3:/home/postgres/ [af] ps -ef | grep "postgres:"
postgres  5958  5955  0 14:15 ?        00:00:00 postgres: checkpointer
postgres  5959  5955  0 14:15 ?        00:00:00 postgres: background writer
postgres  5960  5955  0 14:15 ?        00:00:00 postgres: walwriter
postgres  5961  5955  0 14:15 ?        00:00:00 postgres: autovacuum launcher
postgres  5962  5955  0 14:15 ?        00:00:00 postgres: stats collector
postgres  5963  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor
postgres  5964  5955  0 14:15 ?        00:00:00 postgres: logical replication launcher
postgres  5965  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor worker
postgres  5966  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor worker

The initialization of the monitor node also created a new database and two roles:

postgres@pg-af3:/home/postgres/ [af] psql postgres
psql (12.0 dbi services build)
Type "help" for help.

postgres=# \l
List of databases
Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
------------------+----------+----------+-------------+-------------+-----------------------
pg_auto_failover | autoctl  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
postgres         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
template0        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
                 |          |          |             |             | postgres=CTc/postgres
template1        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
                 |          |          |             |             | postgres=CTc/postgres

postgres=# \du
List of roles
Role name   |                         Attributes                         | Member of
--------------+------------------------------------------------------------+-----------
autoctl      |                                                            | {}
autoctl_node |                                                            | {}
postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

What we got in the new database is the pgautofailover extension:

pg_auto_failover=# \dx
List of installed extensions
Name      | Version |   Schema   |         Description
----------------+---------+------------+------------------------------
pgautofailover | 1.0     | public     | pg_auto_failover
plpgsql        | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

For our management kit to work properly a few PostgreSQL parameters will be set:

pg_auto_failover=# alter system set log_truncate_on_rotation = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_filename = 'postgresql-%a.log';
ALTER SYSTEM
pg_auto_failover=# alter system set log_rotation_age = '1440';
ALTER SYSTEM
pg_auto_failover=# alter system set log_line_prefix = '%m - %l - %p - %h - %u@%d - %x';
ALTER SYSTEM
pg_auto_failover=# alter system set log_directory = 'pg_log';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_messages = 'WARNING';
ALTER SYSTEM
pg_auto_failover=# alter system set log_autovacuum_min_duration = '60s';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_error_statement = 'NOTICE';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_duration_statement = '30s';
ALTER SYSTEM
pg_auto_failover=# alter system set log_checkpoints = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_statement = 'ddl';
ALTER SYSTEM
pg_auto_failover=# alter system set log_lock_waits = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_temp_files = '0';
ALTER SYSTEM
pg_auto_failover=# alter system set log_timezone = 'Europe/Zurich';
ALTER SYSTEM
pg_auto_failover=# alter system set log_connections=on;
ALTER SYSTEM
pg_auto_failover=# alter system set log_disconnections=on;
ALTER SYSTEM
pg_auto_failover=# alter system set log_duration=on;
ALTER SYSTEM
pg_auto_failover=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

What we need for the other nodes is the connection string to the monitoring node:

postgres@pg-af3:/home/postgres/ [af] pg_autoctl show uri
postgres://autoctl_node@pg-af3:5432/pg_auto_failover

Once we have that we can proceed with creating the master instance on the first host:

postgres@pg-af1:/home/postgres/ [pg120] unset PGDATABASE
postgres@pg-af1:/home/postgres/ [] sudo mkdir /u02/pgdata
postgres@pg-af1:/home/postgres/ [] sudo chown postgres:postgres /u02/pgdata
postgres@pg-af1:/home/postgres/ [] pg_autoctl create postgres --pgdata /u02/pgdata/12/PG1 --nodename pg-af1.it.dbi-services.com --monitor postgres://autoctl_node@pg-af3:5432/pg_auto_failover
INFO  Found pg_ctl for PostgreSQL 12.0 at /u01/app/postgres/product/12/db_0/bin/pg_ctl
INFO  Registered node pg-af1.it.dbi-services.com:5432 with id 1 in formation "default", group 0.
INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init"
INFO  Successfully registered as "single" to the monitor.
INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/12/PG1"
INFO  Postgres is not running, starting postgres
INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
INFO  The user "postgres" already exists, skipping.
INFO  CREATE DATABASE postgres;
INFO  The database "postgres" already exists, skipping.
INFO  FSM transition from "init" to "single": Start as a single node
INFO  Initialising postgres as a primary
INFO  Transition complete: current state is now "single"
INFO  Keeper has been succesfully initialized.

The next step is to start the so-called keeper process (this is the process which communicates with the monitoring node about state changes):

postgres@pg-af1:/home/postgres/ [] pg_autoctl run --pgdata /u02/pgdata/12/PG1
INFO  Managing PostgreSQL installation at "/u02/pgdata/12/PG1"
INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
INFO  pg_autoctl service is starting
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".

To integrate that into systemd:

postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl show systemd
20:28:43 INFO  HINT: to complete a systemd integration, run the following commands:
20:28:43 INFO  pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service
20:28:43 INFO  sudo systemctl daemon-reload
20:28:43 INFO  sudo systemctl start pgautofailover
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /u02/pgdata/12/PG1
Environment = 'PGDATA=/u02/pgdata/12/PG1'
User = postgres
ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0

[Install]
WantedBy = multi-user.target

postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /u02/pgdata/12/PG1
Environment = 'PGDATA=/u02/pgdata/12/PG1'
User = postgres
ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0

[Install]
WantedBy = multi-user.target

postgres@pg-af2:/home/postgres/ [PG1] systemctl list-unit-files | grep pgauto
pgautofailover.service                      disabled
20:30:57 postgres@pg-af2:/home/postgres/ [PG1] sudo systemctl enable pgautofailover.service
Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.

If you are on CentOS/Red Hat 8 you will also need this as otherwise the service will not start:

postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] sudo semanage fcontext -a -t bin_t /u01/app/postgres/product/12/db_0/bin/pg_autoctl
postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] restorecon -v /u01/app/postgres/product/12/db_0/bin/pg_autoctl

After rebooting all the nodes (to confirm that the systemd service is working as expected) the state of the cluster reports one primary and a secondary/replica as expected:

postgres@pg-af3:/home/postgres/ [af] pg_autoctl show state
Name |   Port | Group |  Node |     Current State |    Assigned State
---------------------------+--------+-------+-------+-------------------+------------------
pg-af1.it.dbi-services.com |   5432 |     0 |     1 |           primary |           primary
pg-af2.it.dbi-services.com |   5432 |     0 |     2 |         secondary |         secondary

The various states are documented here.

Remember: As this is based on PostgreSQL 12 there will be no recovery.conf on the replica. The replication parameters have been added to postgresql.auto.conf automatically:

postgres@pg-af2:/u02/pgdata/12/PG1/ [PG1] cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=pgautofailover_replicator passfile=''/home/postgres/.pgpass'' connect_timeout=5 host=''pg-af1.it.dbi-services.com'' port=5432 sslmode=prefer sslcompression=0 gssencmode=disable target_session_attrs=any'
primary_slot_name = 'pgautofailover_standby'
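
If you want to double-check that the replication set up by pg_auto_failover is really working, the usual query against pg_stat_replication on the primary will show the standby (just a sanity check, the exact output of course depends on your setup):

postgres@pg-af1:/home/postgres/ [PG1] psql -c "select application_name, client_addr, state, sync_state from pg_stat_replication;" postgres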

That’s it for the setup. Really easy and simple, I like it. In the next post we’ll have a look at controlled switch-overs and fail-over scenarios.


Handling PostgreSQL installations from packages

In this blog I will show how to handle a PostgreSQL installation with a customized PGDATA using the packages provided by the PostgreSQL community.

One issue with the packages is the hard-coded PGDATA, which will be overwritten in the service file with each update of PostgreSQL. This blog entry is based on PostgreSQL 12 with CentOS 7 and CentOS 8.

On a minimal installation, in my opinion, a few things are missing: the net-tools package and nano as an editor (I prefer nano over vi).

CentOS 7:

$ yum install net-tools
$ yum install nano

CentOS 8:

$ dnf install net-tools
$ dnf install nano

When using the PostgreSQL repository it is important to exclude PostgreSQL from the CentOS repositories.

On CentOS 7 you need to edit the CentOS-Base repo file to exclude PostgreSQL from Base and Updates.

$ nano /etc/yum.repos.d/CentOS-Base.repo

# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client.  You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#

[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
#exclude PostgreSQL from os repository 
exclude=postgresql* 

#released updates
[updates]
name=CentOS-$releasever - Updates
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates&infra=$inf$
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
#exclude PostgreSQL from os repository 
exclude=postgresql*

#additional packages that may be useful
[extras]
name=CentOS-$releasever - Extras
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/extras/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus&infra=$$

On CentOS 8 it is just one command to exclude PostgreSQL from the distribution repository:

$ dnf -y module disable postgresql

Add the PostgreSQL repository to CentOS 7, in this example for PostgreSQL 12:

$ yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

And the same for CentOS 8

$ dnf install https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm

Now it is time to install PostgreSQL 12 from the PostgreSQL repository, but do NOT run initdb at this point.

CentOS 7:

$ yum install postgresql12 postgresql12-server postgresql12-contrib

CentOS 8:

$ dnf install postgresql12 postgresql12-server postgresql12-contrib

Now it is time to create the override file for the PostgreSQL service file; the steps are identical on CentOS 7 and CentOS 8.

In my example PGDATA is in /pg_data/12/data, mounted as its own volume.

So edit the postgresql-12.service file with systemctl edit:

$ systemctl edit postgresql-12.service

And add the needed content for your customized PGDATA:

[Service]
Environment=PGDATA=/pg_data/12/data

Save the change; it will create a /etc/systemd/system/postgresql-12.service.d/override.conf file which will be merged with the original service file.

To check the content:

$ cat /etc/systemd/system/postgresql-12.service.d/override.conf
[Service]
Environment=PGDATA=/pg_data/12/data

Reload systemd:

$ systemctl daemon-reload

Your PGDATA should already be owned by the postgres user; if not, make sure that it is:

$ chown -R postgres:postgres /pg_data/

Create the PostgreSQL instance as root user:

$ /usr/pgsql-12/bin/postgresql-12-setup initdb
Initializing database ... OK

Here it is:

[root@centos-8-blog /]# cd /pg_data/12/data/
[root@centos-8-blog data]# ls
base          pg_dynshmem    pg_multixact  pg_snapshots  pg_tblspc    pg_xact
global        pg_hba.conf    pg_notify     pg_stat       pg_twophase  postgresql.auto.conf
log           pg_ident.conf  pg_replslot   pg_stat_tmp   PG_VERSION   postgresql.conf
pg_commit_ts  pg_logical     pg_serial     pg_subtrans   pg_wal

From now on, PostgreSQL minor updates will be done with yum update on CentOS 7 or dnf update on CentOS 8 in one step, with no extra downtime.

But be careful: before running yum update or dnf update, STOP ALL POSTGRESQL INSTANCES!
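
As a sketch, a minor update on CentOS 8 with the service from above would then look like this:

$ systemctl stop postgresql-12
$ dnf update
$ systemctl start postgresql-12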

This also works in environments with many instances: you need a service file and an override.conf for each instance, and an additional instance needs to be created with initdb -D and not with postgresql-12-setup initdb (see the sketch below).
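
As a sketch of such a setup (the paths and the instance name are just examples): create the second instance with initdb -D as the postgres user, copy the service file and point the override of the new unit to the new PGDATA:

$ sudo -u postgres /usr/pgsql-12/bin/initdb -D /pg_data/12/data2
$ cp /usr/lib/systemd/system/postgresql-12.service /etc/systemd/system/postgresql-12-instance2.service
$ systemctl edit postgresql-12-instance2.service
# add the following to the override file:
# [Service]
# Environment=PGDATA=/pg_data/12/data2
$ systemctl daemon-reload
$ systemctl start postgresql-12-instance2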

This method also works with SLES 12.

 


Create a Kubernetes cluster with Google Kubernetes Engine


Nowadays the market for cloud providers is very competitive. Large companies are fighting a very hard battle over the services they provide, and each offers a wide range of more or less identical products, each with its own specific features.

From my point of view, having deployed Kubernetes clusters in several environments (cloud and on-premises), I pay particular attention to Google Cloud for its Google Kubernetes Engine offer. The deployment of a Kubernetes cluster is very fast and gives us a test/production environment in a few minutes.

Therefore, in this blog post, we will explain how to create a Kubernetes cluster in Google Cloud with some useful additional resources.

Prerequisites

A Google account is needed. You can create one by following the sign-up link: https://cloud.google.com. Otherwise, you can use the free tier account: https://cloud.google.com/free/?hl=en.

Create your project

Go to the cloud portal through the following link: https://console.cloud.google.com/home/dashboard

The first step is the creation of a project: before creating any resource, you will need a project to encapsulate all your resources. To properly create a project, follow the steps below:

Enter your project name and click on create:

After a few seconds, your project will be created, and you will have access to the home dashboard:

Create your cluster

Once the project is created and ready to use, let’s create now our Kubernetes cluster. Click on the Kubernetes Engine menu and clusters sub-menu to begin the creation process.

Once the Kubernetes Engine API is enabled, we can click on the create cluster button and configure our cluster as needed.

We choose a standard cluster with 3 cluster nodes. You can edit the resources of your cluster according to your needs. For our example, we kept the default configuration provided by the API.

Click on the create button and after a few minutes, your cluster is ready for use.
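
Alternatively, the same kind of cluster can be created from the command line with the gcloud CLI (a sketch; the cluster name matches our example, while the zone is an assumption and needs to be adapted to your setup):

mehdi@MacBook-Pro: gcloud container clusters create standard-cluster-1 --num-nodes 3 --zone us-central1-a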

Start using your Kubernetes cluster

The Google Cloud SDK is needed to use your Kubernetes cluster from your favorite client platform. To install the SDK, follow the instructions here:

Once the Cloud SDK is properly installed, we can initialize our environment with the following steps:

mehdi@MacBook-Pro: gcloud init
Welcome! This command will take you through the configuration of gcloud.
 
Settings from your current configuration [default] are:
core:
  account: mehdi.bada68@gmail.com
  disable_usage_reporting: 'True'
  project: jx-k8s-2511
 
Pick configuration to use:
 [1] Re-initialize this configuration [default] with new settings
 [2] Create a new configuration
Please enter your numeric choice:  1
 
Your current configuration has been set to: [default]
 
You can skip diagnostics next time by using the following flag:
  gcloud init --skip-diagnostics
 
Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic passed (1/1 checks passed).
 
Choose the account you would like to use to perform operations for
this configuration:
 [1] mehdi.bada68@gmail.com
 [2] Log in with a new account
Please enter your numeric choice:  1
 
You are logged in as: [mehdi.bada68@gmail.com].
 
Pick cloud project to use:
 [1] kubernetes-infra-258110
 [2] Create a new project
Please enter numeric choice or text value (must exactly match list
item):  1
 
Your current project has been set to: [kubernetes-infra-258110].
 
Do you want to configure a default Compute Region and Zone? (Y/n)?  Y
 
Which Google Compute Engine zone would you like to use as project
default?
If you do not specify a zone via a command-line flag while working
with Compute Engine resources, the default is assumed.
 
Please enter numeric choice or text value (must exactly match list
item):  8

Now log in to gcloud:

mehdi@MacBook-Pro: gcloud auth login
… 
You are now logged in as [mehdi.bada68@gmail.com].
Your current project is [kubernetes-infra-258110].  You can change this setting by running:
  $ gcloud config set project PROJECT_ID

Update your ~/.kube/config file with the credentials of the new cluster created before:

mehdi@MacBook-Pro: gcloud container clusters get-credentials standard-cluster-1
Fetching cluster endpoint and auth data.
kubeconfig entry generated for standard-cluster-1.

Your kubectl client is now connected to your remote GKE cluster.

mehdi@MacBook-Pro: kubectl get nodes -o wide
NAME                                                STATUS   ROLES    AGE   VERSION          INTERNAL-IP   EXTERNAL-IP     OS-IMAGE                             KERNEL-VERSION   CONTAINER-RUNTIME
gke-standard-cluster-1-default-pool-1ac453ab-6tj4   Ready    <none>   56m   v1.13.11-gke.9   10.128.0.3    34.70.191.147   Container-Optimized OS from Google   4.14.145+        docker://18.9.7
gke-standard-cluster-1-default-pool-1ac453ab-s242   Ready    <none>   56m   v1.13.11-gke.9   10.128.0.4    35.188.3.165    Container-Optimized OS from Google   4.14.145+        docker://18.9.7
gke-standard-cluster-1-default-pool-1ac453ab-w0j0   Ready    <none>   56m   v1.13.11-gke.9   10.128.0.2    34.70.107.231   Container-Optimized OS from Google   4.14.145+        docker://18.9.7

Deploy Kubernetes Dashboard

After configuring the kubectl client we can start deploying resources on the Kubernetes cluster. One of the most popular resources in Kubernetes is the dashboard. It allows users and admins to have a graphical view of all cluster resources.

Download the dashboard deployment locally:

curl -o dashboard.yaml  https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml

Then apply the deployment:

mehdi@MacBook-Pro: kubectl apply -f dashboard.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created

Create an admin Service Account and Cluster Role Binding that you can use to securely connect to the dashboard with admin-level permissions:

mehdi@MacBook-Pro: vi admin-sa.yaml 

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kubernetes-dashboard

mehdi@MacBook-Pro: kubectl apply -f admin-sa.yaml
serviceaccount/admin created
clusterrolebinding.rbac.authorization.k8s.io/admin created

Now retrieve the authentication token for the admin service account, as shown below:

mehdi@MacBook-Pro: kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin | awk '{print $1}')
Name:         admin-token-dpsl9
Namespace:    kubernetes-dashboard
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: admin
              kubernetes.io/service-account.uid: 888de3dc-ffff-11e9-b5ca-42010a800046

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1119 bytes
namespace:  20 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi10b2tlbi1kcHNsOSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJhZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6Ijg4OGRlM2RjLWZmZmYtMTFlOS1iNWNhLTQyMDEwYTgwMDA0NiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbiJ9.DBrfylt1RFDpHEuTy4l0BY-kRwFqm9Tvfne8Vu-IZVghy87vVWtsCatjt2wzCtMjX-I5oB0YAYmio7pTwPV-Njyd_VvbWupqOF7yiYE72ZXri0liLnQN5qbtyOmswsjim0ehG_yQSHaAqp21cQdPXb59ItBLN7q0-dh8wBRyOMAVLttjbmzBb02XxtJlALYg8F4hAkyHjJAzHAyntMylUXyS2gn471WUYFs1usDDpA8uZRU3_K6oyccXa-xqs8kKRB1Ch6n4Cq9TeMKkoUyv0_alEEQvwkp_uQCl2Rddk7bLNnjfDXDPC9LXOT-2xfvUf8COe5dO-rUXemHJlhPUHw

Copy the token value.

Access the Kubernetes dashboard using the kubectl proxy command:

mehdi@MacBook-Pro: kubectl proxy
Starting to serve on 127.0.0.1:8001

The dashboard is now available in the following link: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/login

Choose the token authentication and paste the value from the previous output.

You now have access to the Kubernetes dashboard and have deployed your first Kubernetes resource!

Deploy an Ingress Load Balancer

In order to access your cluster services externally, we need to create an ingress load balancer for our GKE cluster. The ingress load balancer makes HTTP/HTTPS applications publicly accessible through the creation of an external IP address for the cluster.

Before creating the ingress, we need to deploy a test application for our example. Let’s deploy an NGINX server.

mehdi@MacBook-Pro: vi nginx-deployment.yaml

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
  selector:
    app: nginx


mehdi@MacBook-Pro: kubectl apply -f nginx-deployment.yaml

deployment.apps/nginx-deployment unchanged
service/nginx created

Create the ingress resource and deploy it as following:

mehdi@MacBook-Pro: vi basic-ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: nginx
          servicePort: 80

mehdi@MacBook-Pro: kubectl apply -f basic-ingress.yaml
ingress.extensions/basic-ingress created

Verify the status of the ingress:

mehdi@MacBook-Pro: kubectl get ing -o wide
NAME            HOSTS   ADDRESS         PORTS   AGE
basic-ingress   *       34.102.214.94   80      8h

The ingress resource has been properly created. We can also see the result directly in the Google Cloud dashboard.

The NGINX service is now available via the ingress load balancer and can be accessed through the external IP address shown above (34.102.214.94 in our example).
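
A quick test from the client against that external IP address should return the default NGINX welcome page:

mehdi@MacBook-Pro: curl -I http://34.102.214.94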


Galera Cluster 4 with MariaDB 10.4.8


Last month, at a new customer, I had to install the latest version of the MariaDB server, 10.4.8, to set up a Galera Cluster with 3 master nodes.
The good news was that this version shipped with the latest version of the Galera plugin from Codership: Galera Cluster 4.0.
As usual, installation & configuration were quite easy.

$ sudo yum -y install MariaDB-server
$ sudo yum list installed|grep -i mariadb
MariaDB-client.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-common.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-compat.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-server.x86_64 10.4.8-1.el7.centos @mariadb-main
galera.x86_64 26.4.2-1.rhel7.el7.centos @mariadb-main

New Features:

Now I just want to introduce some of the interesting new high-level features available in this version.

Streaming Replication

In previous versions, large and long-running write transactions regularly caused conflicts during certification-based replication, and because of this, transactions were often aborted and rolled back.
Some of our customers were really suffering and complaining because of this problem and limitation.
Now, with a big transaction, the node that initiated it no longer has to wait until the commit: it breaks the transaction into fragments, certifies them and replicates them to the other master nodes while the transaction is still running (see the example below).
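
Streaming replication is disabled by default and can be enabled at the session level through two system variables. A minimal sketch (the fragment unit and size are just example values):

MariaDB [mysql]> SET SESSION wsrep_trx_fragment_unit='rows';
MariaDB [mysql]> SET SESSION wsrep_trx_fragment_size=10000;

With these settings a transaction is certified and replicated in fragments of 10000 rows; while such a transaction is running, its fragments also show up in the new wsrep_streaming_log system table described below.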

New System Tables

When having a look at the mysql database, we can see 3 new system tables.
They contain information that is already available in status variables, but now it is persistent:

MariaDB [mysql]> show tables from mysql like 'wsrep_%';
+---------------------------+
| Tables_in_mysql (wsrep_%) |
+---------------------------+
| wsrep_cluster             |
| wsrep_cluster_members     |
| wsrep_streaming_log       |
+---------------------------+
3 rows in set (0.000 sec)

These 3 new tables should bring to the database administrators a better overview of the current status of the cluster.

MariaDB [mysql]> select * from wsrep_cluster;
+--------------------------------------+---------+------------+------------------+--------------+
| cluster_uuid                         | view_id | view_seqno | protocol_version | capabilities |
+--------------------------------------+---------+------------+------------------+--------------+
| 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | 3       | 1967       | 4                | 184703       |
+--------------------------------------+---------+------------+------------------+--------------+
1 row in set (0.000 sec)

cluster_uuid is the uuid of the cluster, corresponding to the status variable: wsrep_cluster_state_uuid
view_id is the number of cluster configuration changes, corresponding to the status variable wsrep_cluster_conf_id
view_seqno is the latest Galera sequence number, corresponding to the status variable: wsrep_last_committed
protocol_version is the MariaDB wsrep patch version, corresponding to the status variable: wsrep_protocol_version
capabilities is the capabilities bitmask provided by the Galera library.

MariaDB [mysql]> select * from wsrep_cluster_members;
+--------------------------------------+--------------------------------------+-----------+-----------------------+
| node_uuid                            | cluster_uuid                         | node_name | node_incoming_address |
+--------------------------------------+--------------------------------------+-----------+-----------------------+
| 6542be69-ffd5-11e9-a2ed-a363df0547d5 | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node1     | AUTO                  |
| 6ae6fec5-ffd5-11e9-bb70-da54860baa6d | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node2     | AUTO                  |
| 7054b852-ffd5-11e9-8a45-72b6a9955d28 | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node3     | AUTO                  |
+--------------------------------------+--------------------------------------+-----------+-----------------------+
3 rows in set (0.000 sec)

This system table displays the current membership of the cluster.
It contains a row for each member node of the cluster.
node_uuid is the unique identifier of the master node.
cluster_uuid is the unique identifier of the cluster. It must be the same for all members.
node_name is self-explanatory
node_incoming_address stores the IP address and port for client connections.

MariaDB [mysql]> select * from wsrep_streaming_log;
Empty set (0.000 sec)

This system table will contain rows only while a transaction with streaming replication enabled is running.

Synchronization Functions

These new SQL functions can be used in wsrep synchronization operations.
It is possible to use them to obtain the GTID (Global Transaction ID):
WSREP_LAST_SEEN_GTID(): returns the GTID of the last write transaction observed by the client
WSREP_LAST_WRITTEN_GTID(): returns the GTID of the last write transaction made by the client
WSREP_SYNC_WAIT_UPTO_GTID(): blocks the client until the node applies and commits the given transaction
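
A small usage sketch (the GTID value below is just an illustration; you would pass the value returned by the first statement):

MariaDB [mysql]> SELECT WSREP_LAST_WRITTEN_GTID();
-- on another node, block until that write has been applied there:
MariaDB [mysql]> SELECT WSREP_SYNC_WAIT_UPTO_GTID('6c41b92b-e0f9-11e8-9924-de3112d0ce21:1968');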

Conclusion:

These new features, and especially streaming replication, which is a real improvement and a huge boost to large-transaction support, should bring satisfaction to users and DBAs and hopefully improve their opinion of the MariaDB Galera Cluster.
In another blog, I will try to demonstrate how this streaming replication works.


How to scale up a Patroni cluster


During the preparation of my presentation for pgconf.eu I ran into one big issue: I had to stop my cluster to add a new node. That was not the way I wanted to achieve this. I want a high availability solution that can be scaled up without any outage. Thanks to a little hint during pgconf.eu I was able to find a solution. In this post I will show the manual scale-up, without using a playbook.

Starting position

We start with a 3-node Patroni cluster which can be created using this blog post.
Now we want to add a fourth node to the existing etcd and Patroni cluster. In case you also need a playbook to install a fourth node, check out my GitHub repository.

Scale up the etcd cluster

This step is only needed when you want to scale up your etcd cluster as well: to scale up a Patroni cluster it is not necessary to scale up the etcd cluster; you can, of course, scale up Patroni without adding more etcd members. But maybe someone also needs to scale up their etcd cluster and is searching for a solution. If not, just jump to the next step.

Be sure the etcd and patroni services are not started on the fourth node.

postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1]

Make the following adjustments in the etcd.conf of the 4th node.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/etcd.conf
name: patroni4
data-dir: /u02/pgdata/etcd
initial-advertise-peer-urls: http://192.168.22.114:2380
listen-peer-urls: http://192.168.22.114:2380
listen-client-urls: http://192.168.22.114:2379,http://localhost:2379
advertise-client-urls: http://192.168.22.114:2379
initial-cluster-state: 'existing'
initial-cluster: patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380,patroni4=http://192.168.22.114:2380

Next, add the new etcd member to the existing etcd cluster. You can execute this on any existing member of the cluster.

postgres@patroni1:/home/postgres/ [PG1] etcdctl member add patroni4 http://192.168.22.114:2380
Added member named patroni4 with ID dd9fab8349b3cfc to cluster

ETCD_NAME="patroni4"
ETCD_INITIAL_CLUSTER="patroni4=http://192.168.22.114:2380,patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

Now you can start the etcd service on the fourth node.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start etcd
postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 16:39:16 CEST; 9s ago
 Main PID: 8239 (etcd)
   CGroup: /system.slice/etcd.service
           └─8239 /u01/app/postgres/local/dmk/bin/etcd --config-file /u01/app/postgres/local/dmk/etc/etcd.conf
postgres@patroni4:/home/postgres/ [PG1]

A quick check shows that node 4 has been added to the existing cluster:

postgres@patroni4:/home/postgres/ [PG1] etcdctl member list
dd9fab8349b3cfc: name=patroni4 peerURLs=http://192.168.22.114:2380 clientURLs=http://192.168.22.114:2379 isLeader=false
16e1dca5ee237693: name=patroni1 peerURLs=http://192.168.22.111:2380 clientURLs=http://192.168.22.111:2379 isLeader=false
28a43bb36c801ed4: name=patroni2 peerURLs=http://192.168.22.112:2380 clientURLs=http://192.168.22.112:2379 isLeader=false
5ba7b55764fad76e: name=patroni3 peerURLs=http://192.168.22.113:2380 clientURLs=http://192.168.22.113:2379 isLeader=true
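
Optionally (a hedged suggestion, assuming the etcdctl v2 API that produced the member list above), you can also verify that all members, including the new one, report as healthy:

postgres@patroni4:/home/postgres/ [PG1] etcdctl cluster-health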

Scale up Patroni

Scaling up the Patroni cluster is also really easy.
Adjust the hosts entry in the patroni.yml on the new node.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/patroni.yml | grep hosts
  hosts: 192.168.22.111:2379,192.168.22.112:2379,192.168.22.113:2379,192.168.22.114:2379

Afterwards, start the Patroni service.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start patroni
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 17:03:19 CEST; 5s ago
  Process: 8476 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 8468 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 8482 (patroni)
   CGroup: /system.slice/patroni.service
           ├─8482 /usr/bin/python2 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml
           ├─8500 /u01/app/postgres/product/11/db_5/bin/postgres -D /u02/pgdata/11/PG1/ --config-file=/u02/pgdata/11/PG1/postgresql.conf --listen_addresses=192.168.22.114 --max_worker_processes=8 --max_locks_per_transact...
           ├─8502 postgres: PG1: logger
           ├─8503 postgres: PG1: startup   waiting for 000000020000000000000006
           ├─8504 postgres: PG1: checkpointer
           ├─8505 postgres: PG1: background writer
           ├─8506 postgres: PG1: stats collector
           ├─8507 postgres: PG1: walreceiver
           └─8513 postgres: PG1: postgres postgres 192.168.22.114(48882) idle

To be sure everything runs correctly, check the status of the Patroni cluster:

postgres@patroni4:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 |        | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 | Leader | running |  2 |       0.0 |
|   PG1   | patroni4 | 192.168.22.114 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
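
As an additional hedged check (assuming Patroni's REST API listens on its default port 8008), you can query the new node directly; it should report itself as a running replica:

postgres@patroni4:/home/postgres/ [PG1] curl -s http://192.168.22.114:8008/patroni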

Conclusion

Using the playbooks had one shortcoming: the hosts entry in the patroni.yml only pointed to localhost. When starting the fourth node, Patroni did not look for all the other hosts, it only checked its own availability. This works fine for an initial cluster, but not when you want to extend one.
And: always keep in mind that an etcd cluster needs an odd number of members, so don't add only a fourth etcd node and stop there.
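
To restore an odd member count after the steps above, a fifth member could be added the same way (a sketch only; the host patroni5 and the address 192.168.22.115 are hypothetical, and the new host would need the same etcd.conf adjustments as shown for the fourth node):

postgres@patroni1:/home/postgres/ [PG1] etcdctl member add patroni5 http://192.168.22.115:2380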

The article How to scale up a Patroni cluster first appeared on the dbi services blog.
