
Migrating your users from md5 to scram authentication in PostgreSQL


One of the new features in PostgreSQL 10 was the introduction of stronger password authentication based on SCRAM-SHA-256. How can you migrate your existing users that currently use md5 authentication to the new method without any interruption? Actually that is quite easy, as you will see in a few moments, but there is one important point to consider: not every client/driver already supports SCRAM-SHA-256 authentication, so you need to check that first. Here is the list of the drivers and their support for SCRAM-SHA-256.

The default method that PostgreSQL uses to encrypt passwords is defined by the “password_encryption” parameter:

postgres=# show password_encryption;
 password_encryption 
---------------------
 md5
(1 row)

Let’s assume we have a user that was created like this in the past:

postgres=# create user u1 login password 'u1';
CREATE ROLE

With the default method of md5 the hashed password looks like this:

postgres=# select passwd from pg_shadow where usename = 'u1';
               passwd                
-------------------------------------
 md58026a39c502750413402a90d9d8bae3c
(1 row)

As you can see the hash starts with md5, so we know that this hash was generated by the md5 algorithm. When we want this user to use scram-sha-256 instead, what do we need to do? The first step is to change the “password_encryption” parameter:

postgres=# alter system set password_encryption = 'scram-sha-256';
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
postgres=# select current_setting('password_encryption');
 current_setting 
-----------------
 scram-sha-256
(1 row)

From now on the server will use scram-sha-256 and no longer md5. But what happens when our user wants to connect to the instance once we have changed that? Currently this is defined in pg_hba.conf:

postgres=> \! grep u1 $PGDATA/pg_hba.conf
host    postgres        u1              192.168.22.1/24         md5

Even though the default is not md5 anymore, the user can still connect to the instance because the password hash did not change for that user:

postgres=> \! grep u1 $PGDATA/pg_hba.conf
host    postgres        u1              192.168.22.1/24         md5

postgres@rhel8pg:/home/postgres/ [PGDEV] psql -h 192.168.22.100 -p 5433 -U u1 postgres
Password for user u1: 
psql (13devel)
Type "help" for help.

postgres=> 

Once the user changed the password:

postgres@rhel8pg:/home/postgres/ [PGDEV] psql -h 192.168.22.100 -p 5433 -U u1 postgres
Password for user u1: 
psql (13devel)
Type "help" for help.

postgres=> \password
Enter new password: 
Enter it again: 
postgres=> 

… the hash of the new password is not md5 but SCRAM-SHA-256:

postgres=# select passwd from pg_shadow where usename = 'u1';
                                                                passwd                               >
----------------------------------------------------------------------------------------------------->
 SCRAM-SHA-256$4096:CypPmOW5/uIu4NvGJa+FNA==$PNGhlmRinbEKaFoPzi7T0hWk0emk18Ip9tv6mYIguAQ=:J9vr5CQDuKE>
(1 row)

One could expect that from now on the user is not able to connect anymore, as we did not change pg_hba.conf yet:

postgres@rhel8pg:/home/postgres/ [PGDEV] psql -h 192.168.22.100 -p 5433 -U u1 postgres
Password for user u1: 
psql (13devel)
Type "help" for help.

postgres=> 

But in reality that still works, as the server now uses the SCRAM-SHA-256 algorithm. So once all the users have changed their passwords you can safely switch the rule in pg_hba.conf and you’re done:

postgres=> \! grep u1 $PGDATA/pg_hba.conf
host    postgres        u1              192.168.22.1/24         scram-sha-256

postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

You just need to make sure that none of the users still has a hash starting with md5; they should all have the new format starting with SCRAM-SHA-256.
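A simple check against pg_shadow (the same catalog queried above) lists the roles that still need to change their password before you switch pg_hba.conf; this is just a quick sketch:

-- list roles whose stored password hash is still md5
select usename from pg_shadow where passwd like 'md5%';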



Microsoft Inspire Partner Event 2019 / Day 1


dbi services at Microsoft Inspire Conference

Summary of my first participation – Day 1

After 20 years in the business of IT infrastructure, the time has come to open my mind 😉 and Microsoft Inspire is the perfect opportunity. It is a partner conference that takes quite a different approach compared to other software vendors I have worked with (e.g. Oracle). Indeed, Oracle Open World mixes partners and customers, which is not always a good idea since the messages may differ for these different stakeholders.

Core Session of the Day

Microsoft Inspire Conference at T-Mobile Arena

The day started in the Las Vegas T-Mobile Arena with the first Core Session, opened by Gabriella Schuster (Corporate Vice President of Microsoft’s Worldwide Partner Group (WPG)). She claimed that she works for a company with strong values, such as listening. As a result, they listened to their partners and cancelled their plan to charge Silver and Gold partners all over the world for Microsoft products. That was an exciting announcement, applauded by all partners. Starting a partner conference with such news obviously makes things easier 😉. We may thus raise a slight doubt about the timing of this warmly welcomed announcement.

Core Session with Gabriella Schuster (Corporate VP of Microsoft’s Worldwide Partner Group)

Microsoft’s focus is set on users

The way Microsoft organizes its conferences is quite different from what I have seen over the past 20 years. At Microsoft, the focus is mainly set on the power of the users and the way they can easily collaborate with their peers within an organization.

The goal is not to promote an incredible back-office solution or technology. The goal is to focus on what the user can achieve within his/her company. We really feel this difference. The back-end technologies are only there to reinforce the power of « end users », as Microsoft calls them (as opposed to simple « customers »). The focus is clearly set on collaboration (Teams & Office 365). Azure and SQL Server are just « back-end means » to achieve better collaboration. However, it doesn’t mean that there is no need for expertise on those layers 😉.

The way Microsoft presents its technologies is therefore really interesting. A use case about Unilever was presented by Judson Althoff (Executive Vice President, Worldwide Commercial Business) during the Core Session. Almost every Microsoft service (IoT, Azure, Teams, …) used by the customer was covered. The focus was set on how the customer improved its business, made it more efficient, more robust and even more compliant with environmental requirements (carbon-neutral production). Once again, a huge difference with back-end focused technology companies.

As a logical « consequence » of the statements made during the Core Session, several presentations explained how Microsoft considers and addresses security topics within the Cloud. There too, we see the CEO’s footprint in pushing the Azure, Open Source and Secure IT transition at Microsoft. Since the company focuses on trust and on partners, securing the customers has become a core component of the business. This hasn’t always been a priority for Microsoft and has thus changed a lot in the past months.

To me, Microsoft is not « cool again » (Gartner analysis), it has become cool 😉. The number of sessions claiming that Microsoft embraces Open Source software is further evidence.

Link to the Core Session video

Keywords are: collaboration, trust and security

User collaboration is a core strategy for Microsoft, for which they have a strong and efficient weapon: Microsoft Teams. It is the product with the fastest user adoption in the company’s history: more than 13 million users within two years. Thus, a lot of 20-minute sessions and demos on this topic were presented, and Gabriella Schuster put a strong focus on it during the Core Session. Beside and below Teams, there are Office 365 and SharePoint. Below SharePoint, there is SQL Server. This summarizes the whole Microsoft strategy: user first, then back-end. To start with Teams, an organization must be ready to embrace change to create value. The solution has strong technical advantages. However, to promote it within an organization, the latter must be open to change and its users must be ready to change the way they used to work and collaborate.

End-users and collaboration are the key

How about DevOps at Microsoft?

As a former infrastructure engineer, I had to follow the sessions about DevOps at Microsoft (have a look below to understand Microsoft’s DevOps definition). As a software vendor, Microsoft obviously had to embrace those practices for its own software.

Definition of DevOps at Microsoft

SQL Server: dbi services’ core business

As a founder of dbi services, I couldn’t attend this event without following at least one SQL Server session. Jack Tang, Director of Product Marketing SQL Server at Microsoft Corporate, presented the way to shift from “on prem” SQL Server databases to PaaS services within Azure. SSAS, SSIS and SSRS related SQL Server workloads may be the easiest types of workloads to move to the PaaS Azure Cloud services. Mr. Tang also presented some quick paths to prioritize and perform such lifts. He estimates the duration of such projects at between 2.5 and 9 months.

SQL Server is part of dbi services' expertise

To conclude this post, it was an intensive and really interesting day and I am looking forward to the coming ones 👍.


dbi services at Microsoft Inspire Conference 2019


SQL Server 2019 is in “public preview” stage, live from Microsoft Inspire 2019 (Day 2)

Anshul Rampal (SQL Server product marketing team) and Rony Chatterjee (SQL Server Core team / Product Manager) presented the new SQL Server 2019 features during a 20-minute session.

SQL Server is 25 years old! Its new version, SQL Server 2019, was published last night (released as a “public preview”). @David Barbarin, I got the information during the session 😉, sorry to beat you to it.

The main milestone for me as an Open Source promoter is the further support of Linux and containers. Daniel Westermann, when will you start SQL Server support with your team 😉? To support Linux, an additional level of abstraction has been implemented within the core architecture. Furthermore, Apache Spark is now delivered within the engine, to allow the processing of any kind of data (normalized or not). Spark will be more and more integrated as a data processing engine, beside or even within the SQL engine.

A huge number of customers are still running SQL Server 2008, so there is a huge amount of work and many projects for partners over the coming years.

Microsoft wants SQL Server to be the Data Hub of the future through Data Virtualization (keeping the data where it is). The 20-minute session during the conference focused on this feature.

See my previous blog post: Microsoft definitely goes Open Source – support of Linux, support of OpenShift, integration of Spark in order to manage non-relational data, and so on…! Remember: “Microsoft becomes cool again” 😉

Oracle optimised its SQL optimiser 😉; Microsoft did the same through “Intelligent Query Processing”, and some recommendations will even be made by the optimiser.

SQL Server 2019 fully supports containers, and even High Availability may be provided through Kubernetes. Last but not least, SQL Server 2019 is fully hybrid: it may run partially “on prem” and partially in the Cloud. Through this feature and Data Virtualization, SQL Server becomes the strategic company Data Hub. A demo showed how easy it will be to integrate data from any source (e.g. Oracle).

The picture shows a nice data visualisation demo, quick but really impressive. Well done Microsoft! 👍


Email Spoofing


Have you ever had this unhealthy sensation of being accused of facts that do not concern you? To feel helpless in the face of an accusatory mail which, because of its imperative and accusing tone, has a way of heaping opprobrium on us?

This is the purpose of this particular kind of sextortion mail that uses spoofing to try to extort money from you: a message from a supposed “hacker” who claims to have hacked into your computer. He threatens to publish compromising images taken without your knowledge with your webcam and asks you for a ransom, most of the time in virtual currency.

Something like that:

 

Date:  Friday, 24 May 2019 at 09:19 UTC+1
Subject: oneperson
Your account is hacked! Renew the pswd immediately!
You do not heard about me and you are definitely wondering why you’re receiving this particular electronic message, proper?
I’m ahacker who exploitedyour emailand digital devicesnot so long ago.
Do not waste your time and make an attempt to communicate with me or find me, it’s not possible, because I directed you a letter from YOUR own account that I’ve hacked.
I have started malware to the adult vids (porn) site and suppose that you watched this website to enjoy it (you understand what I mean).
Whilst you have been keeping an eye on films, your browser started out functioning like a RDP (Remote Control) that have a keylogger that gave me authority to access your desktop and camera.
Then, my softaquiredall data.
You have entered passcodes on the online resources you visited, I intercepted all of them.
Of course, you could possibly modify them, or perhaps already modified them.
But it really doesn’t matter, my app updates needed data regularly.
And what did I do?
I generated a reserve copy of every your system. Of all files and personal contacts.
I have managed to create dual-screen record. The 1 screen displays the clip that you were watching (you have a good taste, ha-ha…), and the second part reveals the recording from your own webcam.
What exactly must you do?
So, in my view, 1000 USD will be a reasonable amount of money for this little riddle. You will make the payment by bitcoins (if you don’t understand this, search “how to purchase bitcoin” in Google).
My bitcoin wallet address:
1816WoXDtSmAM9a4e3HhebDXP7DLkuaYAd
(It is cAsE sensitive, so copy and paste it).
Warning:
You will have 2 days to perform the payment. (I built in an exclusive pixel in this message, and at this time I understand that you’ve read through this email).
To monitorthe reading of a letterand the actionsin it, I utilizea Facebook pixel. Thanks to them. (Everything thatis usedfor the authorities may helpus.)

In the event I do not get bitcoins, I shall undoubtedly give your video to each of your contacts, along with family members, colleagues, etc?

 

Users who are victims of these scams receive a message from a stranger who presents himself as a hacker. This alleged “hacker” claims to have taken control of his victim’s computer after the victim visited a pornographic site (or any other site that morality would condemn). The cybercriminal then announces that he has compromising videos of the victim made with his webcam. He threatens to publish them to the victim’s personal or even professional contacts if the victim does not pay him a ransom. This ransom, which ranges from a few hundred to several thousand dollars, is claimed in a virtual currency (usually in Bitcoin, but not only).

To scare the victim even more, cybercriminals sometimes go so far as to write to the victim from his or her own email address, in order to make him or her believe that they have actually taken control of the account.

First of all, there is no need to be afraid. Indeed, while the “hack” claimed by the cybercriminals is not in theory impossible to achieve, in practice it remains technically complex and, above all, time-consuming to implement. Since scammers target their victims by the thousands, it can be deduced that they would not have the time to do what they claim to have done.

These messages are just an attempted scam. In other words, if you receive such a blackmail message and do not pay, obviously nothing more will happen.

Then, there is no need to change your email credentials. Your email address is usually already known and circulating on the Internet, because you use it regularly on different sites to identify yourself and communicate. These sites have sometimes resold or exchanged their address files with various partners, more or less scrupulous, for marketing purposes.

If cybercriminals have written to you from your own email address to make you believe that they have taken control of it: be aware that the sender’s address in a message is just a simple display field that can very easily be spoofed without requiring a lot of technical skill.

In any case, the way to go is simple: don’t panic, don’t answer, don’t pay, just throw this mail in the trash (and don’t forget to empty it regularly). 

On the mail server side, setting up certain elements can help to prevent this kind of mail from spreading in the organization. This involves deploying the following measures on your mail server:

  •       SPF (Sender Policy Framework): This is a standard for verifying the domain name of the sender of an email (standardized in RFC 7208 [1]). The adoption of this standard is likely to reduce spam. It is based on SMTP (Simple Mail Transfer Protocol), which does not provide a sender verification mechanism. SPF aims to reduce the possibility of spoofing by publishing a record in the DNS (Domain Name System) indicating which IP addresses are allowed or forbidden to send mail for the domain in question (see the example records after this list).
  •         DKIM (DomainKeys Identified Mail): This is a reliable authentication standard for the domain name of the sender of an email that provides effective protection against spam and phishing (standardized in RFC 6376 [2]). DKIM works by cryptographic signature, verifies the authenticity of the sending domain and also guarantees the integrity of the message.
  •       DMARC (Domain-based Message Authentication, Reporting and Conformance): This is a technical specification to help reduce email misuse by providing a solution for deploying and monitoring authentication issues (standardized in RFC 7489 [3]). DMARC standardizes the way recipients perform email authentication using the SPF and DKIM mechanisms.
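As an illustration only, SPF and DMARC end up as plain DNS TXT records. The domain, IPv4 address and reporting mailbox below are placeholders, not recommendations for your environment (DKIM additionally requires publishing your public key under a selector, which is key-specific and therefore not shown):

; SPF: only the MX hosts and the listed IPv4 address may send mail for example.com
example.com.        IN TXT "v=spf1 mx ip4:203.0.113.10 -all"

; DMARC: quarantine messages that fail SPF/DKIM alignment and send aggregate reports
_dmarc.example.com. IN TXT "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"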

 

REFERENCES

[1] S. Kitterman, “Sender Policy Framework (SPF)”, RFC 7208, 2014, https://tools.ietf.org/html/rfc7208
[2] D. Crocker, T. Hansen, M. Kucherawy, “DomainKeys Identified Mail (DKIM) Signatures”, RFC 6376, 2011, https://tools.ietf.org/html/rfc6376
[3] M. Kucherawy, E. Zwicky, “Domain-based Message Authentication, Reporting and Conformance (DMARC)”, RFC 7489, 2015, https://tools.ietf.org/html/rfc7489


Schedule reboots of your AWS instances and how that can result in a hard reboot and corruption


From time to time you might need to reboot your AWS instances, maybe because you applied some patches or for whatever other reason. Rebooting an AWS instance can be done in several ways: you can of course do it directly from the AWS console, and you can use the AWS command line utilities as well. If you want to schedule a reboot you can either do that using CloudWatch or you can use SSM Maintenance Windows. In this post we will only look at CloudWatch and Systems Manager, as these two can be used to schedule the reboot easily using AWS native utilities. You could, of course, do it with cron and the AWS command line utilities as well, but that is not in the scope of this post.
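For reference, the ad-hoc variant with the AWS command line utilities mentioned above is a one-liner; the instance IDs below are only placeholders:

# reboot one or more instances directly from the command line
aws ec2 reboot-instances --instance-ids i-0123456789abcdef0 i-0fedcba9876543210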

For CloudWatch the procedure for rebooting instances is the following: Create a new rule:

Go for “Schedule” and give a cron expression. In this case it means: 16-July-2019 at 07:45. Select the “EC2 RebootInstances API call” and provide the instance IDs you want to have rebooted. There is one limitation: you can only add up to five targets. If you need more, then you have to use Systems Manager as described later in this post. You should pre-create an IAM role with sufficient permissions which you can use for this, as otherwise a new one will be created each time.
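Since the screenshots are not reproduced here, the cron expression for 16-July-2019 at 07:45 (UTC) follows the usual six-field CloudWatch format and looks roughly like this:

# minutes  hours  day-of-month  month  day-of-week  year
  45       07     16            7      ?            2019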

Finally give a name and a description, that’s it:


Once the time reaches your cron expression target, the instance(s) will reboot.

The other solution for scheduling stuff against many instances is to use AWS SSM. It requires a bit more preparation work, but in the end this is the solution we decided to go for, as more instances can be scheduled with one maintenance window (up to 50) and you can combine several tasks, e.g. executing something before doing the reboot and something else after the reboot.

The first step is to create a new maintenance window:

Of course it needs a name and an optional description:

Again, in this example, we use a cron expression for the scheduling (same as above in the CloudWatch example). Be aware that this is UTC time:

Once the maintenance window is created we need to attach a task to it. Until now we only specified a time to run something but we did not specify what to run. Attaching a task can be done in the task section of the maintenance window:

In this case we go for an “Automation task”. Name and description are not required:

The important part is the document to run, in our case it is “AWS-RestartEC2Instance”:

Choose the instances you want to run the document against:

And finally specify the concurrency and error count and again, an IAM role with sufficient permissions to perform the actions defined in the document:

Last, but not least, specify a pseudo parameter called “{TARGET_ID}” which will tell AWS SSM to run that against all the instances you selected in the upper part of the screen:

That’s it. Your instances will be rebooted at the time you specified in the cron expression. All fine and easy, and you never have to worry about scheduled instance reboots again. Just adjust the cron expression and maybe the list of instances and you are done for the next scheduled reboot. Really? We did it like that against 100 instances and we got a real surprise. What happened? Not many, but a few instances were rebooted hard, and one of them even needed to be restored afterwards. Why? This never happened in the tests we did before. When an instance does not reboot within 4 minutes, AWS performs a hard reboot. This can lead to corruption, as stated here. When you have busy instances at the time of the reboot, this is not what you want. On Windows you get something like this:

You can easily reproduce that by putting a Windows system under heavy load with a CPU stress test and then scheduling a reboot as described above.

In the background the automation document calls aws:changeInstanceState and that comes with a force parameter:

… and here we have it again: Risk of corruption. When you take a closer look at the automation document that stops an EC2 instance you can see that as well:

So what is the conclusion of all this? It is not to blame AWS for anything; all is documented and works as documented. Testing in a test environment does not necessarily mean it works in production as well. Even if it is documented, you might not expect it because your tests went fine and you missed the part of the documentation where the behavior is explained. AWS Systems Manager is still a great tool for automating tasks, but you really need to understand what happens before implementing it in production. And finally: working on public clouds makes many things easier, but others harder to understand and troubleshoot.


Microsoft Inspire Partner Event 2019 / Day 3


dbi services at Microsoft Inspire Conference

Core Note Satya Nadella – Day 3

The day started with the Core Note from Satya Nadella. Being alone at the conference allowed me to find one of the very last well-located seats and have a nice view of the event 😉.

I had never seen a keynote from Satya before and I must admit that I was impressed. Giving someone the opportunity to lead a company because he deserves it, and not because he has some « favorable origins », should be respected. Well done Microsoft. Neither born in the U.S. nor rewarded by any Ivy League university, Satya Nadella really inspires us and deserves his position within the company.

This promotion should also « inspire » some Western European companies which are sometimes more « conservative » in the organization of their management.

Satya presented all products Microsoft is working on with the main focus on making the world, the people, the companies or the government better, more efficient and more successful.

SQL improvement with Hyperscale / Not only for SQL Server!

The Hyperscale SQL feature for Azure was promoted. Hyperscale will support SQL Server, PostgreSQL and Cosmos DB (one of the future strategic data platforms for Microsoft). Hyperscale benefits from several innovations such as the new SQL abstraction layers (within SQL Server) and from several caching levels. The whole backup & recovery strategy is based on snapshots. Even full restores are claimed to be performed in less than 10 minutes! The database business is strategic for Microsoft but also for dbi services, stay tuned!

Since I’m not a Microsoft SQL Server expert, I cannot state since when customers have had this opportunity. However, it was mentioned several times that customers can « transfer » their Microsoft on-prem licenses into Azure, at least for SQL Server. This fully makes sense.

Microsoft does not only produce database products 😉

Besides this announcement, the HoloLens, Microsoft Dynamics 365 and, of course, Teams were presented to make FY2020 as successful as the previous year. Teams is really becoming a new strategic product for Microsoft and we saw some impressive collaboration demonstrations. Teams is based on SharePoint and on Azure, which both sit on SQL Server. Microsoft promoted one of its strengths: its products are all well integrated.

Security, Trust and … Security

At the end of the keynote, Brad Smith (Chief Legal Officer) presented other strategic topics for Microsoft such as security. He claimed that Microsoft is strongly concerned about cyber security and wants to deploy worldwide the rules defined within the GDPR.

Once again, it’s all about trust. The Cloud war will be won by the ones who win the trust of the customers.

I come back to the sentence of the year, in my opinion, pronounced by Satya during his Core Note: « You cannot claim trust, you have to earn it day after day ». I can only confirm that I’m fully aligned with this statement. It’s even more true in the cyber security business.

Real Azure migration return of experience

I had the opportunity to attend several sessions, for example about successful Azure migration projects.

A Microsoft partner has developed a Cloud Campaign and a Cloud strategy to help customers migrate to Azure. They developed two products addressed to the Cloud.

As a best practice, an assessment of the customers’ landscape is needed. A global picture will allow to understand the overall context in terms of quality and quantity (which applications are running, how have they been built, etc.).

Usually after half a year of “rehosting” in the cloud, the customer is unhappy because of the cost. Once migrated to the Cloud, the customer has to modernize its landscape to fully benefit from the Cloud.

It’s important to know about all the available XaaS Azure services to give the best advice to the customer. This partner is also using Infrastructure as Code to optimize deployments through Terraform. Discussions with customers should focus on business value, Cloud benefits and, as a further trend, Enterprise DevOps.

According to this partner, when transferring larger infrastructures to the cloud, DevOps becomes a must and a requirement to benefit from the Cloud’s advantages.

Business with Red Hat and Microsoft Azure

During a panel session, James Read (EMEA Principal Solution Architect – Red Hat) confirmed that the Microsoft / Red Hat partnership is really successful and strong and has improved a lot since the first initiatives in November 2015.

As a consequence, Red Hat Linux has definitively become a core operating system of the Azure IaaS offering, and both companies are further developing their common offering. Of course, OpenShift is also fully available on the Azure Cloud, which seems to be the platform of choice for this container platform.

The collaboration with Red Hat goes further. It will concern the other stacks of the « Red company », like JBoss for instance.

According to some discussions during the panel, Windows containers should be supported in OpenShift quite soon…

Azure Continual Service Improvement & Mea Culpa

Three speakers from Microsoft, involved in the Azure promotion and development, presented how they constantly improve the Azure services (availability & security).

They transparently presented two outages of the Azure Active Directory service in November 2018. They then explained why it happened, what they learned and what they improved to avoid such outages in the future. At Microsoft, transparency has become part of the culture.

Conclusion

The conclusion of these days within the Microsoft ecosystem confirmed what I’ve heard about Microsoft over the last months. They really stopped fighting against competitors and are involving them in their offering. Microsoft is not afraid of Open Source. It makes the best possible use of it within its Cloud and such a strategic shift is hugely positive for the company.

Microsoft becomes cool (again) 😉. And if you want to join one of the coolest Microsoft Team, have a look at this.


Documentum – D2+Pack Plugins not installed correctly


In a previous blog, I explained how D2 can be installed in silent mode. In this blog, I will talk about a possible issue that might happen when doing so: the D2+Pack Plugins not being installed, even if you ask D2 to install them, and without any message or error related to this issue. The first time I had this issue was several years ago but I never blogged about it. I faced it again recently, so I thought I would this time.

So first, let’s prepare the D2 and D2+Pack packages for the silent installation. I will take the D2_template.xml file from my previous blog as a starting point for the silent parameter file:

[dmadmin@cs_01 ~]$ cd $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ ls *.zip *.tar.gz
-rw-r-----. 1 dmadmin dmadmin 491128907 Jun 16 08:12 D2_4.7.0_P25.zip
-rw-r-----. 1 dmadmin dmadmin  61035679 Jun 16 08:12 D2_pluspack_4.7.0.P25.zip
-rw-r-----. 1 dmadmin dmadmin 122461951 Jun 16 08:12 emc-dfs-sdk-7.3.tar.gz
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_4.7.0_P25.zip -d $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25.zip -d $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ unzip $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Dar-Install.zip -d $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/
[dmadmin@cs_01 D2-Install]$ tar -xzvf $DOCUMENTUM/D2-Install/emc-dfs-sdk-7.3.tar.gz -C $DOCUMENTUM/D2-Install/
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ #See the previous blog for the content of the "/tmp/dctm_install/D2_template.xml" file
[dmadmin@cs_01 D2-Install]$ export d2_install_file=$DOCUMENTUM/D2-Install/D2.xml
[dmadmin@cs_01 D2-Install]$ cp /tmp/dctm_install/D2_template.xml ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###WAR_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###BPM_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DAR_REQUIRED###,true," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DOCUMENTUM###,$DOCUMENTUM," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###PLUGIN_LIST###,$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar;$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar;$DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar;," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###JMS_HOME###,$DOCUMENTUM_SHARED/wildfly9.0.1," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s,###DFS_SDK_PACKAGE###,emc-dfs-sdk-7.3," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ read -s -p "  ----> Please enter the Install Owner's password: " dm_pw; echo; echo
  ----> Please enter the Install Owner's password: <TYPE HERE THE PASSWORD>
[dmadmin@cs_01 D2-Install]$ sed -i "s,###INSTALL_OWNER###,dmadmin," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$ sed -i "s,###INSTALL_OWNER_PASSWD###,${dm_pw}," ${d2_install_file}
[dmadmin@cs_01 D2-Install]$
[dmadmin@cs_01 D2-Install]$ sed -i "s/###DOCBASE_LIST###/Docbase1/" ${d2_install_file}
[dmadmin@cs_01 D2-Install]$

 

Now that the silent file is ready and all source packages are available, we can start the D2 installation with the command below. Please note the usage of the tracing/debugging options as well as the usage of the “-Djava.io.tmpdir” Java option to ask D2 to put all temporary files in a specific directory. With this, D2 is supposed to trace/debug everything and use my specific temporary folder:

[dmadmin@cs_01 D2-Install]$ java -DTRACE=true -DDEBUG=true -Djava.io.tmpdir=$DOCUMENTUM/D2-Install/tmp -jar $DOCUMENTUM/D2-Install/D2_4.7.0_P25/D2-Installer-4.7.0.jar ${d2_install_file}

 

The D2 Installer printed the following extract:

...
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Installing plugin: $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/plugin/D2-Widget-Install.jar
...
...
Current line: #################################
Current line: #           Plugins               #
Current line: #################################
Current line: #plugin_1=../C2/C2-Plugin.jar
Updating line with 'plugin_'.
Updating plugin 1 with plugin name: D2-Widget-Plugin.jar and config exclude value of: false
Updating plugin 2 with plugin name: D2-Specifications-Plugin.jar and config exclude value of: false
Current line: #plugin_2=../O2/O2-Plugin.jar
Current line: #plugin_3=../P2/P2-Plugin.jar
...

 

As you can see, there are no errors, so if you aren’t paying attention, you might think that the D2+Pack is properly installed. It’s not. At the end of the extract above, you can see that the D2 Installer is updating the plugins list with some elements (D2-Widget-Plugin.jar & D2-Specifications-Plugin.jar). If there were no issue, the D2+Pack Plugins would have been added to this section as well, which isn’t the case.

You can check all temporary files and all log files: it will not be printed anywhere that there was an issue while installing the D2+Pack Plugins. In fact, there are three things missing:

  • The DARs of the D2+Pack Plugins weren’t installed
  • The libraries of the D2+Pack Plugins weren’t deployed into the JMS
  • The libraries of the D2+Pack Plugins weren’t packaged in the WAR files

There is a way to quickly check whether the D2+Pack Plugins DARs have been installed: just look inside the docbase config folder. There should be one log file for the D2 Core DARs as well as one log file for each of the D2+Pack Plugins. So this is what you should get:

[dmadmin@cs_01 D2-Install]$ cd $DOCUMENTUM/dba/config/Docbase1/
[dmadmin@cs_01 Docbase1]$ ls -ltr *.log
-rw-r-----. 1 dmadmin dmadmin  62787 Jun 16 08:18 D2_CORE_DAR.log
-rw-r-----. 1 dmadmin dmadmin   4794 Jun 16 08:20 D2-C2_dar.log
-rw-r-----. 1 dmadmin dmadmin   3105 Jun 16 08:22 D2-Bin_dar.log
-rw-r-----. 1 dmadmin dmadmin   2262 Jun 16 08:24 D2-O2_DAR.log
[dmadmin@cs_01 Docbase1]$

 

If you only have “D2_CORE_DAR.log”, then you are potentially facing this issue. You could also check the “csDir” folder that you put in the D2 silent parameter file: if this folder doesn’t contain “O2-API.jar”, “C2-API.jar” or “D2-Bin-API.jar”, then you have the issue as well. Obviously, you could also check the list of installed DARs in the repository…

So what’s the issue? Well, remember above when I mentioned the “-Djava.io.tmpdir” Java option to specifically ask D2 to put all temporary files under a certain location? The D2 Installer, for the D2 part, uses this option without issue… But for the D2+Pack installation, there is actually a hardcoded path for the temporary files, which is /tmp. Therefore, it will ignore this Java option and will try instead to execute the installation under /tmp.

This is the issue I have faced a few times already and it’s the one I wanted to talk about in this blog. For security reasons, you might have to deal from time to time with specific mount options on file systems. In this case, the “noexec” option was set on the /tmp mount point and therefore D2 wasn’t able to execute commands under /tmp; instead of printing an error, it just silently bypassed the installation. I had an SR opened with the Documentum Support (when it was still EMC) to see if it was possible to use the Java option and not /tmp, but it looks like it’s still not solved since I had the exact same issue with D2 4.7 P25, which was released very recently.

Since there is apparently no way to specify which temporary folder should be used for the D2+Pack Plugins, you should either perform the installation manually (DAR installation + libraries in JMS & WAR files) or remove the “noexec” option on the file system for the duration of the installation:

[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,noexec,nosuid,nodev)
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ sudo mount -o remount,exec /tmp
[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,nosuid,nodev)
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ #Execute the D2 Installer here
[dmadmin@cs_01 Docbase1]$
[dmadmin@cs_01 Docbase1]$ sudo mount -o remount /tmp
[dmadmin@cs_01 Docbase1]$ mount | grep " /tmp"
/dev/mapper/VolGroup00-LogVol06 on /tmp type ext4 (rw,noexec,nosuid,nodev)

 

With the workaround in place, the D2 Installer should now print the following (same extract as above):

...
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/C2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/D2-Bin-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar
Plugin install command: [java, -jar, $DOCUMENTUM/D2-Install/D2_pluspack_4.7.0.P25/Plugins/O2-Install-4.7.0.jar, $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/scripts/C6-Plugins-Install_new.xml]
Line read: [ Starting automated installation ]
Line read: Current MAC address : [ Starting to unpack ]
Line read: [ Processing package: core (1/2) ]
Line read: [ Processing package: DAR (2/2) ]
Line read: [ Unpacking finished ]
Line read: [ Writing the uninstaller data ... ]
Line read: [ Automated installation done ]
Installing plugin: $DOCUMENTUM/D2-Install/tmp/D2_4.7.0/plugin/D2-Widget-Install.jar
...
...
Current line: #################################
Current line: #           Plugins               #
Current line: #################################
Current line: #plugin_1=../C2/C2-Plugin.jar
Updating line with 'plugin_'.
Updating plugin 1 with plugin name: D2-Widget-Plugin.jar and config exclude value of: false
Updating plugin 2 with plugin name: C2-Plugin.jar and config exclude value of: false
Updating plugin 3 with plugin name: O2-Plugin.jar and config exclude value of: false
Updating plugin 4 with plugin name: D2-Specifications-Plugin.jar and config exclude value of: false
Updating plugin 5 with plugin name: D2-Bin-Plugin.jar and config exclude value of: false
Current line: #plugin_2=../O2/O2-Plugin.jar
Current line: #plugin_3=../P2/P2-Plugin.jar
...

 

As you can see above, the output is quite different: it means that the D2+Pack Plugins have been installed.

 


ORA-01000 and agent13c


Recently I received error messages from OEM 13c saying too many cursors were opened in a database:

instance_throughput:ORA-01000: maximum open cursors exceeded

My database currently had this open_cursors value:

SQL> show parameter open_cursors

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     300

I decided to increase its value to 800:

SQL> alter system set open_cursors=800;

System altered.

But a few minutes later I received the same message again. I decided to have a closer look to discover what was happening.

SQL> SELECT  max(a.value) as highest_open_cur, p.value as max_open_cur FROM v$sesstat a, v$statname b, v$parameter p WHERE  a.statistic# = b.statistic#  and b.name = 'opened cursors current' and p.name= 'open_cursors' group by p.value;

HIGHEST_OPEN_CUR     MAX_OPEN_CUR
   300                 800

So I needed to find out which session was causing the error:

SQL> select a.value, s.username, s.sid, s.serial# from v$sesstat a, v$statname b, v$session s where a.statistic# = b.statistic#  and s.sid=a.sid and b.name = 'opened cursors current' and s.username is not null;

     VALUE USERNAME                              SID    SERIAL#
---------- ------------------------------ ---------- ----------
         9 SYS                                     6      36943
         1 SYS                                   108      31137
         1 SYS                                   312      15397
       300 SYS                                   417      31049
        11 SYS                                   519      18527
         7 SYS                                   619      48609
         1 SYS                                   721      51139
         0 PUBLIC                                922         37
        17 SYS                                  1024          1
        14 SYS                                  1027      25319
         1 SYS                                  1129      40925

A sys connection is using 300 cursors :=(( let’s see what it is:
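If you prefer a query over the OEM screens, v$open_cursor shows what such a session keeps open; this is just a sketch, using SID 417 from the output above:

-- what the offending session currently has open, most frequent statements first
select sql_text, count(*)
from   v$open_cursor
where  sid = 417
group  by sql_text
order  by count(*) desc;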

Great the agent 13c is causing the problem :=((

I had already encountered this kind of problem at another client’s site. In fact, the agent 13c uses metrics to determine the High Availability Disk or Media Backup status every 15 minutes, and those metrics use a lot of cursors. The best way is to disable those metrics to avoid ORA-01000 errors:

After reloading the agent and reevaluating the alert, the incident disappeared successfully :=)



SQL Server 2019 availability group R/W connection redirection, routing mesh and load balancing


The SQL Server 2019 availability group feature will provide secondary-to-primary replica read/write connection redirection. I wrote about it in a previous blog post here. It consists in redirecting client application connections to the primary replica regardless of the target server specified in the connection string. That’s pretty interesting in some scenarios, such as read scale-out or specific multi-subnet configurations where creating the traditional AG listener is not a viable option.
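As a reminder from that previous post, the redirection works because each replica advertises a read/write routing URL. For the topology used below (availability group AG2019, replicas WIN20191/WIN20192/WIN20193 on port 1433), the configuration of one replica looks roughly like this, to be repeated for the other replicas:

-- sketch: advertise the R/W routing URL of replica WIN20191
ALTER AVAILABILITY GROUP [AG2019]
MODIFY REPLICA ON N'WIN20191'
WITH (PRIMARY_ROLE (READ_WRITE_ROUTING_URL = N'TCP://WIN20191:1433'));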

The new R/W connection redirection capability does the job, but the one-million-dollar question here is: what happens if one of the replicas specified in my connection string suddenly becomes unavailable? Referring to the BOL, the connection will fail regardless of the role that the replica on the target server plays, but we can mitigate the issue by introducing the failover partner parameter in the connection string. As a reminder, the Failover Partner keyword in the connection string works in a database mirroring setup and prevents prolonged application downtime. But from my point of view, we could likely go another way and benefit from all the power of this new availability group feature by introducing a load balancer on top of this topology, as we could do with Docker Swarm or K8s architectures. Indeed, if we take a closer look, this new mechanism provided by SQL Server 2019 is pretty similar to the routing mesh capabilities of container orchestrators, with the same advantages and weaknesses as well. I wrote a blog post about Docker Swarm architectures where we need to implement a proxy to load balance the traffic to avoid getting stuck with the routing mesh capability when a node gets unhealthy.

I just applied the same kind of configuration, using an HAProxy (but you can obviously use your own), to my availability group topology and the behavior was basically the same. Here is the intended behavior:

 

Here is the configuration of my HAProxy, including my 3 AG replicas (WIN20191, WIN20192, WIN20193) and a round-robin algorithm at the bottom:

…
backend rserve_backend
    mode tcp
    option tcplog
    option log-health-checks
    option redispatch
    log global
    balance roundrobin
    timeout connect 10s
    timeout server 1m
    server WIN20191 192.168.40.205:1433 check
    server WIN20192 192.168.40.206:1433 check
    server WIN20193 192.168.40.207:1433 check
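
For completeness, this backend needs a frontend that actually listens on the port 81 used for the connection tests below; a minimal frontend could look like this (the frontend name is arbitrary):

frontend sqlserver_frontend
    # accept TCP connections on port 81 and hand them to the backend above
    bind *:81
    mode tcp
    option tcplog
    default_backend rserve_backend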

 

Let’s try connections directly through my HAProxy, which listens on port 81 in my test scenario. Note that for this first test I will connect to the master database to force the local connection to stick to each replica rather than going through the R/W redirection process. The goal is to check if the round-robin algorithm comes into play …

$connectionString = "Server=192.168.40.14,81;uid=sa; pwd=xxxx;Integrated Security=False;Initial Catalog=master;pooling=false"

$connection = New-Object System.Data.SqlClient.SqlConnection
$connection.ConnectionString = $connectionString
$connection.Open()

$sqlCommandText="SELECT 'Current server : ' + @@SERVERNAME AS server_name"
$sqlCommand = New-Object system.Data.sqlclient.SqlCommand($sqlCommandText,$connection)
$sqlCommand.ExecuteScalar()

$connection.Close()
$connection.Dispose()

 

… and that’s the case, as shown below:

Test connexion initial server nb : 0 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 1 - 192.168.40.14,81 - Current server : WIN20192
Test connexion initial server nb : 2 - 192.168.40.14,81 - Current server : WIN20193
Test connexion initial server nb : 3 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 4 - 192.168.40.14,81 - Current server : WIN20192
Test connexion initial server nb : 5 - 192.168.40.14,81 - Current server : WIN20193

 

Let’s now try forcing the R/W redirection. This time I set the correct target database, named dummy2, for my availability group AG2019.

$connectionString = "Server=192.168.40.14,81;uid=sa; pwd=xxxx;Integrated Security=False;Initial Catalog=dummy2;pooling=false"

$connection = New-Object System.Data.SqlClient.SqlConnection
$connection.ConnectionString = $connectionString
$connection.Open()

$sqlCommandText="SELECT 'Current server : ' + @@SERVERNAME AS server_name"
$sqlCommand = New-Object system.Data.sqlclient.SqlCommand($sqlCommandText,$connection)
$sqlCommand.ExecuteScalar()

$connection.Close()
$connection.Dispose()

Test connexion initial server nb : 0 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 1 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 2 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 3 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 4 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 5 - 192.168.40.14,81 - Current server : WIN20191

 

This time the R/W redirection takes effect and each established connection is redirected to my primary replica, WIN20191.

Finally, let’s simulate an outage of one of my replicas, let’s say the WIN20193 replica, by turning it off, and let’s see what happens below:

Test connexion initial server nb : 32 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 33 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 34 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 35 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 36 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 37 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 38 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 39 - 192.168.40.14,81 - Current server : WIN20191
Test connexion initial server nb : 40 - 192.168.40.14,81 - Current server : WIN20191

 

Well, from a connection perspective nothing has changed and the HAProxy continues to load balance connections between the remaining healthy replicas. The R/W connection redirection mechanism still continues to come into play as well. A quick look at the HAProxy log indicates the WIN20193 replica got unhealthy and the HAProxy has evicted this replica from the game.

[WARNING] 203/063813 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 timeout, check duration: 2001ms, status: 2/3 UP.
[WARNING] 203/063818 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 timeout, check duration: 2002ms, status: 1/3 UP.
[WARNING] 203/063822 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 timeout, check duration: 2001ms, status: 0/2 DOWN.
[WARNING] 203/063822 (1772) : Server rserve_backend/WIN20193 is DOWN. 2 active and 0 backup servers left. 2 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 203/063848 (1772) : Health check for server rserve_backend/WIN20193 failed, reason: Layer4 connection problem, info: "No route to host", check duration: 4ms, status: 0/2 DOWN.

 

The new R/W redirection capability provided by Microsoft will certainly extend the possible scenarios with availability groups. With previous versions of SQL Server, using a load balancer was limited to R/O workloads, but SQL Server 2019 will probably change the game on this topic. Let’s see what happens in the future!

 

 

 

 

 


Kubernetes – How to install a single master cluster with kubeadm?


To bootstrap a minimal K8S cluster for testing purposes, kubeadm is a very interesting tool. The goal of this blog is not to explain the K8S architecture or its components to you, but only the deployment part, using kubeadm.

The final architecture of our cluster will be the following:

Source:https://elastisys.com/2018/01/25/setting-highly-available-kubernetes-clusters/

Docker Installation

The first step is to install Docker on each cluster node:

yum install yum-utils device-mapper-persistent-data lvm2
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo

yum update && yum install docker-ce-18.06.2.ce
mkdir /etc/docker

cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
]
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload
systemctl restart docker

 

Kubeadm Installation

Before installing the Kubernetes components, some prerequisites need to be configured on your cluster nodes.

Disable SWAP devices
docker@docker-manager000 ~]$ cat /proc/swaps
Filename				Type		Size	Used	Priority
/dev/dm-1                               partition	5242876	0	-1

[docker@docker-manager000 ~]$ sudo swapoff -a

Remove any matching reference found in /etc/fstab and reboot the system for the changes to take effect.

Set SELinux in permissive mode (effectively disabling it)
[docker@docker-manager000 ~]$ sudo setenforce 0
[docker@docker-manager000 ~]$ sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
docker@docker-manager000 ~]$ sudo sed -i 's/^SELINUX=permissive$/SELINUX=disabled/' /etc/selinux/config
Hostname configuration
[docker@docker-manager000 ~]$ hostnamectl set-hostname docker-manager000
[docker@docker-manager000 ~]$ hostname
docker-manager000
Network connectivity

Modify the /etc/hosts of each server as following:

[docker@docker-manager000 ~]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost6 localhost6.localdomain6 localhost6.localdomain

xx.xx.x.xx	docker-manager000
xx.xx.x.xx	docker-worker000
xx.xx.x.xx	docker-worker001

 

Installing kubeadm, kubelet and kubectl

First, configure the Kubernetes repository for downloading the latest stable version of Kubernetes:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

Then install packages:

[docker@docker-manager000 ~]$ sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
[docker@docker-manager000 ~]$ sudo systemctl enable --now kubelet

 

Install NTP service
[root@docker-manager000 containerd]$ yum install ntp
[root@docker-manager000 containerd]$ systemctl start ntpd
[root@docker-manager000 containerd]$ systemctl enable ntpd
Ensure net.bridge.bridge-nf-call-iptables is set to 1 in your sysctl config for proper traffic routing
[root@docker-manager000 ~]$ cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

Or set the values manually:
[docker@docker-manager000 ~]$ sudo sysctl net.bridge.bridge-nf-call-iptables=1
[docker@docker-manager000 ~]$ sudo sysctl net.bridge.bridge-nf-call-ip6tables=1

[root@docker-manager000 ~]$ sysctl --system
Make sure that the br_netfilter module is loaded
[root@docker-manager000 ~]$ lsmod | grep br_netfilter (to check)
[root@docker-manager000 ~]$ modprobe br_netfilter (to load)

 

Initialize your cluster using Kubeadm

The kubeadm command will install all the control plane components for you (API server + etcd + scheduler + controller-manager):

[docker@docker-manager000 ~]$ sudo kubeadm init --pod-network-cidr=192.168.0.0/16

Copy-paste the join command with the token and keep it for the next steps.
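
If you lose this output, the join command can be regenerated later from the master, for example:

[docker@docker-manager000 ~]$ sudo kubeadm token create --print-join-command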

Install a pod network add-on: Calico
kubectl apply -f https://docs.projectcalico.org/v3.8/manifests/calico.yaml

Import the kube config into your home directory before starting to use your cluster.

[docker@docker-manager000 ~]$ mkdir -p $HOME/.kube
[docker@docker-manager000 ~]$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[docker@docker-manager000 ~]$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
Join the cluster from the worker nodes

From each worker node, you can now execute the join command as follows:

sudo kubeadm join xx.xx.x.xx:6443 --token vv2iva.yt241t7epiu7i4iz \
    --discovery-token-ca-cert-hash sha256:558ec4a25cd39185fd753aba81748569f033af9c15e48dcdf3d10206eecdf177

On the master, check the cluster status:

[docker@docker-manager000 ~]$ kubectl get nodes -o wide
NAME                STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
docker-manager000   Ready    master   20d   v1.14.2   xx.xx.x.xx    <none>        CentOS Linux 7 (Core)   3.10.0-957.12.2.el7.x86_64   docker://18.9.6
docker-worker000    Ready    <none>   20d   v1.14.1   xx.xx.x.xx    <none>        CentOS Linux 7 (Core)   3.10.0-957.10.1.el7.x86_64   docker://18.9.5
docker-worker001    Ready    <none>   20d   v1.14.1   xx.xx.x.xx    <none>        CentOS Linux 7 (Core)   3.10.0-957.10.1.el7.x86_64   docker://18.9.5

List all pods within the kube-system namespace:

[docker@docker-manager000 ~]$ kubectl get pods -n kube-system -o wide
NAME                                        READY   STATUS    RESTARTS   AGE   IP               NODE                NOMINATED NODE   READINESS GATES
calico-kube-controllers-6894d6f4f4-jtwnn    1/1     Running   5          20d   192.168.123.39   docker-manager000   <none>           <none>
calico-node-9586j                           1/1     Running   6          20d   xx.xx.x.xx       docker-manager000   <none>           <none>
calico-node-lc6vh                           1/1     Running   3          20d   xx.xx.x.xx       docker-worker000    <none>           <none>
calico-node-ph9j7                           1/1     Running   3          20d   xx.xx.x.xx       docker-worker001    <none>           <none>
coredns-fb8b8dccf-cxfcq                     1/1     Running   4          11d   192.168.123.41   docker-manager000   <none>           <none>
coredns-fb8b8dccf-pf879                     1/1     Running   5          11d   192.168.27.145   docker-worker001    <none>           <none>
etcd-docker-manager000                      1/1     Running   6          20d   xx.xx.x.xx       docker-manager000   <none>           <none>
kube-apiserver-docker-manager000            1/1     Running   2          11d   xx.xx.x.xx       docker-manager000   <none>           <none>
kube-controller-manager-docker-manager000   1/1     Running   2          11d   xx.xx.x.xx       docker-manager000   <none>           <none>
kube-proxy-j8v2b                            1/1     Running   2          11d   xx.xx.x.xx       docker-worker000    <none>           <none>
kube-proxy-jp5dr                            1/1     Running   2          11d   xx.xx.x.xx       docker-worker001    <none>           <none>
kube-proxy-z57xx                            1/1     Running   2          11d   xx.xx.x.xx       docker-manager000   <none>           <none>
kube-scheduler-docker-manager000            1/1     Running   2          11d   xx.xx.x.xx       docker-manager000   <none>           <none>

Your cluster is now up and running! You can start your first deployment 🙂
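
As a quick sanity check, a first throw-away deployment could look like this, using the public nginx image as an example:

[docker@docker-manager000 ~]$ kubectl create deployment nginx --image=nginx
[docker@docker-manager000 ~]$ kubectl expose deployment nginx --port=80 --type=NodePort
[docker@docker-manager000 ~]$ kubectl get deployments,pods,svc -o wide
[docker@docker-manager000 ~]$ kubectl delete deployment,service nginx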

This article "Kubernetes – How to install a single master cluster with kubeadm?" appeared first on the dbi services Blog.

Deploying SQL Server 2019 container on RHEL 8 with podman


Having a fresh install of RHEL8 on my lab environment, I was curious to take a look at the new containerization stuff from Red Hat in the context of SQL Server 2019. Chances are good that the future version of SQL Server will be available and supported on the latest version of Red Hat, but for now this blog post is purely experimental. This time I wanted to share with you some thoughts about the new Podman command.

First of all, we should be aware that since RHEL8, Red Hat decided to replace Docker with CRI-O/Podman in order to provide a “daemonless” container world, especially for Kubernetes. In 2016, the Kubernetes project introduced the Container Runtime Interface (CRI). Basically, with CRI, Kubernetes can be container runtime-agnostic. CRI-O is an open source project initiated by Red Hat the same year that gives the ability to run containers directly from Kubernetes without any unnecessary code or tooling, as long as the container remains OCI-compliant. Because Docker is not shipped anymore (and officially not supported) by Red Hat since RHEL8, we need a client tool for working with containers, and this is where Podman steps in. To cut a long story short, Podman implements almost all the Docker CLI commands and more.

So, let’s have an overview of Podman commands through the installation of a SQL Server 2019 based container. It is worth noting that Podman is not intended to be used in the context of a “standalone” container environnement and should be used with an container orchestrator like K8s or an orchestration platform like OpenShift.  That said,  let’s first create a host directory to persist the SQL Server database files.

$ sudo mkdir -p  /var/mssql/data
$ sudo chmod 755 -R /var/mssql/data

 

Then let’s download the SQL Server 2019 RHEL image. We will use the following Podman command:

$ sudo podman pull mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
Trying to pull mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1...Getting image source signatures
Copying blob 079e961eee89: 70.54 MiB / 70.54 MiB [========================] 1m3s
Copying blob 1b493d38a6d3: 1.20 KiB / 1.20 KiB [==========================] 1m3s
Copying blob 89e62e5b4261: 333.24 MiB / 333.24 MiB [======================] 1m3s
Copying blob d39017c722a8: 174.82 MiB / 174.82 MiB [======================] 1m3s
Copying config dbba412361d7: 4.98 KiB / 4.98 KiB [==========================] 0s
Writing manifest to image destination
Storing signatures
dbba412361d7ca4fa426387e1d6fc3ec85e37d630bfe70e6599b5116d392394d

 

Note that if you’re already comfortable with the Docker commands, the shift to Podman will be easy thanks to the similarity between the both tools. To get information of the new fresh image, we will use the following Podman command:

$ sudo podman images
REPOSITORY                            TAG           IMAGE ID       CREATED       SIZE
mcr.microsoft.com/mssql/rhel/server   2019-CTP3.1   dbba412361d7   3 weeks ago   1.79 GB
$ sudo podman inspect dbba
…
"GraphDriver": {
            "Name": "overlay",
            "Data": {
                "LowerDir": "/var/lib/containers/storage/overlay/b2769e971a1bdb62f1c0fd9dcc0e9fe727dca83f52812abd34173b49ae55e37d/diff:/var/lib/containers/storage/overlay/4b0cbf0d9d0ff230916734a790f47ab2adba69db44a79c8eac4c814ff4183c6d/diff:/var/lib/containers/storage/overlay/9197342671da8b555f200e47df101da5b7e38f6d9573b10bd3295ca9e5c0ae28/diff",
                "MergedDir": "/var/lib/containers/storage/overlay/b372c0d6ff718d2d182af4639870dc6e4247f684d81a8b2dc2649f8517b9fc53/merged",
                "UpperDir": "/var/lib/containers/storage/overlay/b372c0d6ff718d2d182af4639870dc6e4247f684d81a8b2dc2649f8517b9fc53/diff",
                "WorkDir": "/var/lib/containers/storage/overlay/b372c0d6ff718d2d182af4639870dc6e4247f684d81a8b2dc2649f8517b9fc53/work"
            }
        },
…

 

As shown above, Podman uses the CRI-O back-end store directory with the /var/lib/containers path, instead of using the Docker default storage location (/var/lib/docker).

Go ahead and let’s take a look at the Podman info command:

$ podman info
…
OCIRuntime:
    package: runc-1.0.0-54.rc5.dev.git2abd837.module+el8+2769+577ad176.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.0'
…
store:
  ConfigFile: /home/clustadmin/.config/containers/storage.conf
  ContainerStore:
    number: 0
  GraphDriverName: overlay

 

The same kind of information is provided by the Docker info command, including the runtime and the graph driver name, which is overlay in my case. Generally speaking, creating and getting information about a container with Podman is pretty similar to what we would do with the usual Docker commands. Here is, for instance, the command to spin up a SQL Server container based on the RHEL image:

$ sudo podman run -d -e 'ACCEPT_EULA=Y' -e \
> 'MSSQL_SA_PASSWORD=Password1'  \
> --name 'sqltest' \
> -p 1460:1433 \
> -v /var/mssql/data:/var/opt/mssql/data:Z \
> mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
4f5128d36e44b1f55d23e38cbf8819041f84592008d0ebb2b24ff59065314aa4
$ sudo podman ps
CONTAINER ID  IMAGE                                            COMMAND               CREATED        STATUS            PORTS                   NAMES
4f5128d36e44  mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1  /opt/mssql/bin/sq...  4 seconds ago  Up 3 seconds ago  0.0.0.0:1460->1433/tcp  sqltest
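
As a quick connectivity check from the host (assuming the mssql-tools package is installed there; the query is just an example), we could connect through the published port 1460:

$ /opt/mssql-tools/bin/sqlcmd -S localhost,1460 -U sa -P Password1 -Q "SELECT @@SERVERNAME"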

 

Here comes the interesting part. Looking at the pstree output, we may notice that there are no dependencies on any (Docker) daemon with the CRI-O implementation. Usually, with the Docker implementation, we would find the containerd daemon and the related shim for the process within the tree.

$ pstree
systemd─┬─NetworkManager───2*[{NetworkManager}]
        ├─…
        ├─conmon─┬─sqlservr─┬─sqlservr───138*[{sqlservr}]
        │        │          └─{sqlservr}

 

By using the runc command below, we may notice that the MSSQL container (identified here by its ID) is actually running through CRI-O and the runc runtime.

$ sudo runc list -q
4f5128d36e44b1f55d23e38cbf8819041f84592008d0ebb2b24ff59065314aa4

 

Let’s have a look at the existing namespace. The 9449 PID corresponds to the SQL Server process running in isolation mode through Linux namespaces.

$ sudo lsns 
…
4026532116 net         2  9449 root   /opt/mssql/bin/sqlservr
4026532187 mnt         2  9449 root   /opt/mssql/bin/sqlservr
4026532188 uts         2  9449 root   /opt/mssql/bin/sqlservr
4026532189 ipc         2  9449 root   /opt/mssql/bin/sqlservr
4026532190 pid         2  9449 root   /opt/mssql/bin/sqlservr

$ ps aux | grep sqlservr
root       9449  0.1  0.6 152072 25336 ?        Ssl  05:08   0:00 /opt/mssql/bin/sqlservr
root       9465  5.9 18.9 9012096 724648 ?      Sl   05:08   0:20 /opt/mssql/bin/sqlservr
clustad+   9712  0.0  0.0  12112  1064 pts/0    S+   05:14   0:00 grep --color=auto sqlservr

 

We can double check that the process belongs to the SQL Server container by using the nsenter command:

sudo nsenter -t 17182 --mount --uts --ipc --net --pid sh
sh-4.2# ps aux
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.0  0.7 152076 28044 ?        Ssl  Jul23   0:00 /opt/mssql/bin/sqlservr
root          9  2.2 19.7 9034224 754820 ?      Sl   Jul23   0:28 /opt/mssql/bin/sqlservr
root        319  0.0  0.0  13908  3400 ?        S    00:01   0:00 sh
root        326  0.0  0.1  53832  3900 ?        R+   00:02   0:00 ps aux

 

Well, we used different Podman commands to spin up a container that meets the OCI specification, like Docker. For the sake of curiosity, let’s build a custom image from a Dockerfile. In fact, this is a custom image we developed for customers to meet our best practice requirements.

$ ls -l
total 40
drwxrwxr-x. 2 clustadmin clustadmin   70 Jul 24 02:06 BestPractices
drwxrwxr-x. 2 clustadmin clustadmin   80 Jul 24 02:06 DMK
-rw-rw-r--. 1 clustadmin clustadmin  614 Jul 24 02:06 docker-compose.yml
-rw-rw-r--. 1 clustadmin clustadmin 2509 Jul 24 02:06 Dockerfile
-rw-rw-r--. 1 clustadmin clustadmin 3723 Jul 24 02:06 entrypoint.sh
-rw-rw-r--. 1 clustadmin clustadmin 1364 Jul 24 02:06 example.docker-swarm-compose.yml
-rw-rw-r--. 1 clustadmin clustadmin  504 Jul 24 02:06 healthcheck.sh
-rw-rw-r--. 1 clustadmin clustadmin   86 Jul 24 02:06 mssql.conf
-rw-rw-r--. 1 clustadmin clustadmin 4497 Jul 24 02:06 postconfig.sh
-rw-rw-r--. 1 clustadmin clustadmin 2528 Jul 24 02:06 Readme.md
drwxrwxr-x. 2 clustadmin clustadmin   92 Jul 24 02:06 scripts

 

To build an image from a Dockerfile, the corresponding Podman command is as follows:

$ sudo podman build -t dbi_mssql_linux:2019-CTP3.1 .
…
--> 5db120fba51f3adc7482ec7a9fed5cc4194f13e97b855d9439a1386096797c39
STEP 65: FROM 5db120fba51f3adc7482ec7a9fed5cc4194f13e97b855d9439a1386096797c39
STEP 66: EXPOSE ${MSSQL_TCP_PORT}
--> 8b5e8234af47adb26f80d64abe46715637bd48290b4a6d7711ddf55c393cd5a8
STEP 67: FROM 8b5e8234af47adb26f80d64abe46715637bd48290b4a6d7711ddf55c393cd5a8
STEP 68: ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
--> 11045806b8af7cf2f67e5a279692e6c9e25212105bcd104ed17b235cdaea97fe
STEP 69: FROM 11045806b8af7cf2f67e5a279692e6c9e25212105bcd104ed17b235cdaea97fe
STEP 70: CMD ["tail -f /dev/null"]
--> bcb8c26d503010eb3e5d72da4b8065aa76aff5d35fac4d7958324ac3d97d5489
STEP 71: FROM bcb8c26d503010eb3e5d72da4b8065aa76aff5d35fac4d7958324ac3d97d5489
STEP 72: HEALTHCHECK --interval=15s CMD [ "/usr/local/bin/healthcheck.sh" ]
--> e7eedf0576f73c95b19adf51c49459b00449da497cf7ae417e597dd39a9e4c8f
STEP 73: COMMIT dbi_mssql_linux:2019-CTP3.1

 

The image built is now available in the local repository:

$ sudo podman images
REPOSITORY                            TAG           IMAGE ID       CREATED         SIZE
localhost/dbi_mssql_linux             2019-CTP3.1   e7eedf0576f7   2 minutes ago   1.79 GB
mcr.microsoft.com/mssql/rhel/server   2019-CTP3.1   dbba412361d7   3 weeks ago     1.79 GB

 

The next step consists in spinning up a SQL Server container based on this new image. Note that I used a custom parameter, DMK=Y, to drive the creation of our DMK maintenance tool, which includes the deployment of a custom dbi_tools database and related objects that carry out the database maintenance.

$ sudo podman run -d -e 'ACCEPT_EULA=Y' \
> -e 'MSSQL_SA_PASSWORD=Password1' -e 'DMK=Y'  \
> --name 'sqltest2' \
> -p 1470:1433 \
> localhost/dbi_mssql_linux:2019-CTP3.1
d057e0ca41f08a948de4206e9aa07b53450c2830590f2429e50458681d230f6b

 

Let’s check if the dbi_tools has been created during the container runtime phase:

$ sudo podman exec -ti d057 /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P Password1 -Q"SELECT name from sys.databases"
name
--------------------------------------------------------------------------------------------------------------------------------
master
tempdb
model
msdb
dbi_tools

 

Finally, to make the transition to a future blog post, the Podman tool comes with extra commands (under development) that are not available in the Docker CLI. The following example generates a YAML deployment file and the corresponding service from an existing container. Please note, however, that containers with volumes are not supported yet.

The container definition is as follows:

$ sudo podman run -d -e 'ACCEPT_EULA=Y' -e \
'MSSQL_SA_PASSWORD=Password1'  \
--name 'sqltestwithnovolumes' \
-p 1480:1433 \
mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
7e99581eaec4c91d7c13af4525bfb3805d5b56e675fdb53d0061c231294cd442

 

And we get the corresponding YAML file generated by the Podman command:

$ sudo podman generate kube -s 7e99
# Generation of Kubernetes YAML is still under development!
#
# Save the output of this file and use kubectl create -f to import
# it into Kubernetes.
#
# Created with podman-1.0.2-dev
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: 2019-07-24T03:52:18Z
  labels:
    app: sqltestwithnovolumes
  name: sqltestwithnovolumes
spec:
  containers:
  - command:
    - /opt/mssql/bin/sqlservr
    env:
    - name: PATH
      value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    - name: TERM
      value: xterm
    - name: HOSTNAME
    - name: container
      value: oci
    - name: ACCEPT_EULA
      value: "Y"
    - name: MSSQL_SA_PASSWORD
      value: Password1
    image: mcr.microsoft.com/mssql/rhel/server:2019-CTP3.1
    name: sqltestwithnovolumes
    ports:
    - containerPort: 1433
      hostPort: 1480
      protocol: TCP
    resources: {}
    securityContext:
      allowPrivilegeEscalation: true
      capabilities: {}
      privileged: false
      readOnlyRootFilesystem: false
    workingDir: /
status: {}
---
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2019-07-24T03:52:18Z
  labels:
    app: sqltestwithnovolumes
  name: sqltestwithnovolumes
spec:
  ports:
  - name: "1433"
    nodePort: 30309
    port: 1433
    protocol: TCP
    targetPort: 0
  selector:
    app: sqltestwithnovolumes
  type: NodePort
status:
  loadBalancer: {}

 

By default, a service of type NodePort has been created by the command. This last command definitely needs further testing!
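
To go one step further, the generated manifest can be saved and fed to a Kubernetes cluster, as suggested in the header of the output (assuming kubectl is configured against a cluster):

$ sudo podman generate kube -s 7e99 > sqltestwithnovolumes.yaml
$ kubectl create -f sqltestwithnovolumes.yaml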

See you

This article "Deploying SQL Server 2019 container on RHEL 8 with podman" appeared first on the dbi services Blog.

Alfresco – ActiveMQ basic setup


Apache ActiveMQ

ActiveMQ is an open source message broker from the Apache Software Foundation that implements the Java Message Service (JMS) and supports a lot of protocols. In Alfresco 5, ActiveMQ was introduced as a new, optional, component in the stack. It was, at the beginning, only used for “side” features like Alfresco Analytics or Alfresco Media Management in the early Alfresco 5.0. In Alfresco 6.0, ActiveMQ was still used for Alfresco Media Management but also for the Alfresco Sync Service. It’s only starting with Alfresco 6.1, released last February, that ActiveMQ became a required component, used for the same things but now also for transformations.

The Alfresco documentation doesn’t really describe how to install ActiveMQ or how to configure it, it just explains how to connect Alfresco to it. Therefore, I thought I would write a small blog about how to do a basic installation of ActiveMQ for a usage in Alfresco.

Alfresco 6.1 supports ActiveMQ v5.15.6, so that’s the one I will be using as an example for this blog.

First let’s start with defining some environment variables that will be used to know where to put ActiveMQ binaries and data:

[alfresco@mq_n1 ~]$ echo "export ACTIVEMQ_HOME=/opt/activemq" >> ~/.profile
[alfresco@mq_n1 ~]$ echo "export ACTIVEMQ_DATA=\$ACTIVEMQ_HOME/data" >> ~/.profile
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep "ACTIVEMQ" ~/.profile
export ACTIVEMQ_HOME=/opt/activemq
export ACTIVEMQ_DATA=$ACTIVEMQ_HOME/data
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ source ~/.profile
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ echo $ACTIVEMQ_DATA
/opt/activemq/data
[alfresco@mq_n1 ~]$

 

I’m usually using symlinks for all the components so that I can keep a generic path in case of upgrades, aso… So, let’s download the software and put all that where it should:

[alfresco@mq_n1 ~]$ activemq_version="5.15.6"
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ wget http://archive.apache.org/dist/activemq/${activemq_version}/apache-activemq-${activemq_version}-bin.tar.gz
--2019-07-25 16:55:23--  http://archive.apache.org/dist/activemq/5.15.6/apache-activemq-5.15.6-bin.tar.gz
Resolving archive.apache.org... 163.172.17.199
Connecting to archive.apache.org|163.172.17.199|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 58556801 (56M) [application/x-gzip]
Saving to: ‘apache-activemq-5.15.6-bin.tar.gz’

apache-activemq-5.15.6-bin.tar.gz     100%[=======================================================================>]  55.84M  1.62MB/s    in 35s

2019-07-25 16:55:58 (1.60 MB/s) - ‘apache-activemq-5.15.6-bin.tar.gz’ saved [58556801/58556801]

[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ tar -xzf apache-activemq-${activemq_version}-bin.tar.gz
[alfresco@mq_n1 ~]$ mkdir -p $ACTIVEMQ_HOME-${activemq_version}
[alfresco@mq_n1 ~]$ ln -s $ACTIVEMQ_HOME-${activemq_version} $ACTIVEMQ_HOME
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ ls -l $ACTIVEMQ_HOME/.. | grep -i activemq
lrwxr-xr-x   1 alfresco  alfresco        31 Jul 25 17:04 activemq -> /opt/activemq-5.15.6
drwxr-xr-x   2 alfresco  alfresco        64 Jul 25 17:03 activemq-5.15.6
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ rm -rf ./apache-activemq-${activemq_version}/data
[alfresco@mq_n1 ~]$ mkdir -p $ACTIVEMQ_DATA
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ mv apache-activemq-${activemq_version}/* $ACTIVEMQ_HOME/

 

Once that is done and before starting ActiveMQ for the first time, there are still some configurations to be done. It is technically possible to add specific authentication for the communications between Alfresco and ActiveMQ or to set up the communications over SSL, for example. It depends on the usage you will have for ActiveMQ, but as a minimal configuration for use with Alfresco, I believe that the default users (“guest” to access the broker & “user” to access the web console) should at least be removed and the admin password changed:

[alfresco@mq_n1 ~]$ activemq_admin_pwd="Act1v3MQ_pwd"
[alfresco@mq_n1 ~]$ activemq_broker_name="`hostname -s`"
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Remove user "user" from the web console
[alfresco@mq_n1 ~]$ sed -i "/^user:[[:space:]]*.*/d" $ACTIVEMQ_HOME/conf/jetty-realm.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Remove user "guest" from the broker
[alfresco@mq_n1 ~]$ sed -i "/^guest.*/d" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Change admin password
[alfresco@mq_n1 ~]$ sed -i "s/^admin=.*/admin=${activemq_admin_pwd}\n/" $ACTIVEMQ_HOME/conf/users.properties
[alfresco@mq_n1 ~]$ sed -i "s/^admin.*/admin: ${activemq_admin_pwd}, admin/" $ACTIVEMQ_HOME/conf/jetty-realm.properties
[alfresco@mq_n1 ~]$ sed -i "s/^activemq.username=.*/activemq.username=admin/" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$ sed -i "s/^activemq.password=.*/activemq.password=${activemq_admin_pwd}/" $ACTIVEMQ_HOME/conf/credentials.properties
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -E "brokerName|storeUsage |tempUsage " $ACTIVEMQ_HOME/conf/activemq.xml
    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost" dataDirectory="${activemq.data}">
                <storeUsage limit="100 gb"/>
                <tempUsage limit="50 gb"/>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ # Set broker name & allowed usage
[alfresco@mq_n1 ~]$ sed -i "s/brokerName=\"[^"]*\"/brokerName=\"${activemq_broker_name}\"/" $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$ sed -i 's,storeUsage limit="[^"]*",storeUsage limit="10 gb",' $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$ sed -i 's,tempUsage limit="[^"]*",tempUsage limit="5 gb",' $ACTIVEMQ_HOME/conf/activemq.xml
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ grep -E "brokerName|storeUsage |tempUsage " $ACTIVEMQ_HOME/conf/activemq.xml
    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="mq_n1" dataDirectory="${activemq.data}">
                    <storeUsage limit="10 gb"/>
                    <tempUsage limit="5 gb"/>
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ chmod -R o-rwx $ACTIVEMQ_HOME
[alfresco@mq_n1 ~]$ chmod -R o-rwx $ACTIVEMQ_DATA

 

So above, I set a specific name for the broker; that’s mainly useful to differentiate brokers if you expect to have several of them at some point. I also changed the default storeUsage and tempUsage, mainly to show how it’s done, because these two parameters define the limit that ActiveMQ will be able to use on the file system. I believe the default is way too much for ActiveMQ’s usage in Alfresco, so I always reduce these or use a percentage as the value (percentLimit).

With the default configuration, ActiveMQ uses “${activemq.data}” for the data directory, which actually uses the “$ACTIVEMQ_DATA” environment variable, if present (otherwise it defaults to $ACTIVEMQ_HOME/data). That’s the reason why I set this environment variable: it makes it possible to define a different data folder without having to change the default configuration. This data folder will mainly contain the logs of ActiveMQ, the PID file and the KahaDB for the persistence adapter.

Finally creating a service for ActiveMQ and starting it is pretty easy as well:

[alfresco@mq_n1 ~]$ cat > activemq.service << EOF
[Unit]
Description=ActiveMQ service

[Service]
Type=forking
ExecStart=###ACTIVEMQ_HOME###/bin/activemq start
ExecStop=###ACTIVEMQ_HOME###/bin/activemq stop
Restart=always
User=alfresco
WorkingDirectory=###ACTIVEMQ_DATA###
LimitNOFILE=8192:65536

[Install]
WantedBy=multi-user.target
EOF
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sed -i "s,###ACTIVEMQ_HOME###,${ACTIVEMQ_HOME}," activemq.service
[alfresco@mq_n1 ~]$ sed -i "s,###ACTIVEMQ_DATA###,${ACTIVEMQ_DATA}," activemq.service
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo cp activemq.service /etc/systemd/system/
[alfresco@mq_n1 ~]$ rm activemq.service
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl enable activemq.service
[alfresco@mq_n1 ~]$ sudo systemctl daemon-reload
[alfresco@mq_n1 ~]$
[alfresco@mq_n1 ~]$ sudo systemctl start activemq.service
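
At this point, a quick check that the service is running and that the broker listens on its default ports (61616 for OpenWire, 8161 for the web console) might look like this:

[alfresco@mq_n1 ~]$ sudo systemctl status activemq.service
[alfresco@mq_n1 ~]$ sudo ss -tlnp | egrep "61616|8161"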

 

Once ActiveMQ is set up as you want, registering it in Alfresco is very easy:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### ActiveMQ
messaging.broker.url=failover:(tcp://mq_n1.domain:61616?daemon=false&dynamicManagement=false&trace=false)?timeout=3000&randomize=false
#messaging.username=
#messaging.password=
...
[alfresco@alf_n1 ~]$

 

As mentioned at the beginning of this blog, ActiveMQ supports a lot of protocols so you can use pretty much what you want: TCP, NIO, SSL, NIO SSL, Peer (2 Peer), UDP, Multicast, HTTP, HTTPS, aso… You can find all the details for that here.

To add authentication between Alfresco and ActiveMQ, you will need to enable the properties in the alfresco-global.properties (the two that I commented above) and define the appropriate authentication in the ActiveMQ broker configuration. There is an example in the Alfresco documentation.
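
As a hedged sketch (the user name and password below are placeholders, and the broker side still needs a matching account, for example via ActiveMQ’s simpleAuthenticationPlugin in activemq.xml), the Alfresco side would then look like this:

[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### ActiveMQ
messaging.broker.url=failover:(tcp://mq_n1.domain:61616?daemon=false&dynamicManagement=false&trace=false)?timeout=3000&randomize=false
messaging.username=alfresco_mq
messaging.password=My+MQ_P4ssw0rd
...
[alfresco@alf_n1 ~]$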

 

This article "Alfresco – ActiveMQ basic setup" appeared first on the dbi services Blog.

dbvisit 9: Automatic Failover


dbvisit 9 has been released for a few months now, and one new feature I tested is the Automatic Failover. In this blog, I assume that dbvisit 9 is already installed and that the standby database is already created. Indeed, I will describe neither the dbvisit installation nor the standby creation, as they are the same as in previous versions.
For more info about the dbvisit installation and/or the dbvisit standby creation, please see these steps in my previous blog or in the dbvisit documentation.
The new Automatic Failover feature requires installing an observer, whose main functions are:
-Provide remote monitoring of existing DDCs, and inform the DBA of problems in close to real-time
-Automatically perform a Failover of the DDC if previously-specified conditions are met.

We will describe observer installation and configuration later

We describe below the configuration we are using
dbvisit1: primary server with Oracle 19c
dbvisit2: standby server with Oracle 19c
dbvisitconsole : Host of the dbvisit console (dbvserver) and for the observer

As specified earlier, we need to install an observer. It is very easy to do: just launch the install-dbvisit executable and follow the instructions.

[oracle@dbvisitconsole installer]$ pwd
/home/oracle/dbvisit/installer
[oracle@dbvisitconsole installer]$ ./install-dbvisit

-----------------------------------------------------------
    Welcome to the Dbvisit software installer.
-----------------------------------------------------------

    It is recommended to make a backup of our current Dbvisit software
    location (Dbvisit Base location) for rollback purposes.

    Installer Directory /home/oracle/dbvisit

>>> Please specify the Dbvisit installation directory (Dbvisit Base).

    The various Dbvisit products and components - such as Dbvisit Standby,
    Dbvisit Dbvnet will be installed in the appropriate subdirectories of
    this path.

    Enter a custom value or press ENTER to accept default [/usr/dbvisit]:
     > /u01/app/dbvisit
    DBVISIT_BASE = /u01/app/dbvisit

    -----------------------------------------------------------
    Component      Installer Version   Installed Version
    -----------------------------------------------------------
    standby        9.0.02_0_gbd40c486                                not installed                 
    dbvnet         9.0.02_0_gbd40c486                                not installed                 
    dbvagent       9.0.02_0_gbd40c486                                not installed                 
    dbvserver      9.0.02_0_gbd40c486                                9.0.02_0_gbd40c486            
    observer       1.02                                              not installed                 

    -----------------------------------------------------------

    What action would you like to perform?
       1 - Install component(s)
       2 - Uninstall component(s)
       3 - Exit

    Your choice: 1

    Choose component(s):
       1 - Core Components (Dbvisit Standby Cli, Dbvnet, Dbvagent)
       2 - Dbvisit Standby Core (Command Line Interface)
       3 - Dbvnet (Dbvisit Network Communication)
       4 - Dbvagent (Dbvisit Agent)
       5 - Dbvserver (Dbvisit Central Console) - Not available on Solaris/AIX
       6 - Dbvisit Observer (Automatic Failover Option) - Not available on Solaris/AIX
    Press ENTER to exit Installer

    Your choice: 6

-----------------------------------------------------------
    Summary of the Dbvisit OBSERVER configuration
-----------------------------------------------------------
    DBVISIT_BASE /u01/app/dbvisit

    Press ENTER to continue

-----------------------------------------------------------
    About to install Dbvisit OBSERVER
-----------------------------------------------------------
    Component observer installed.

    -----------------------------------------------------------
    Component      Installer Version   Installed Version
    -----------------------------------------------------------
    standby        9.0.02_0_gbd40c486                                not installed                 
    dbvnet         9.0.02_0_gbd40c486                                not installed                 
    dbvagent       9.0.02_0_gbd40c486                                not installed                 
    dbvserver      9.0.02_0_gbd40c486                                9.0.02_0_gbd40c486            
    observer       1.02                                              1.02                          

    -----------------------------------------------------------

    What action would you like to perform?
       1 - Install component(s)
       2 - Uninstall component(s)
       3 - Exit

    Your choice: 3

>>> Installation completed
    Install log /tmp/dbvisit_install.log.201907231647.
[oracle@dbvisitconsole installer]$

And once the installation is done, we can start it:

[oracle@dbvisitconsole observer]$ ./observersvc start &
[1] 2866

[oracle@dbvisitconsole observer]$ ps -ef | grep obser
oracle    2866  2275  0 14:25 pts/0    00:00:01 ./observersvc start
oracle    2921  2275  0 14:29 pts/0    00:00:00 grep --color=auto obser
[oracle@dbvisitconsole observer]$

After starting the observer, we have to add the observer server. This is done from the MANAGE CONFIGURATION tab in the menu.

From the Configuration tab, choose NEW on the left to add a dbvisit observer.

Then fill in the information. Note that the default passphrase for the observer is admin900. Then save.

To have the observer monitor our configuration, let’s click on Monitor.

Then specify the poll interval and the number of retries before a failover happens. In our case:

The observer will monitor the configuration every 60 seconds and will retry 5 times if there is any error.
If after 5 minutes (5 x 60 seconds) the problem is not fixed, then an automatic failover will happen.

The observer logfile is located on the observer server

[oracle@dbvisitconsole log]$ pwd
/u01/app/dbvisit/observer/log
[oracle@dbvisitconsole log]$ ls -l
total 8
-rw-r--r--. 1 oracle oinstall 1109 Jul 25 15:24 observer.log
-rw-r--r--. 1 oracle oinstall   97 Jul 25 15:24 orcl_1_observer.log
[oracle@dbvisitconsole log]$

[oracle@dbvisitconsole log]$ tail -f orcl_1_observer.log
2019/07/25 13:24:46 DDC: DDC#1(orcl): Started watchdog: Watchdog successfully started monitoring

Now let’s break the primary database and normally a failover should happen after 5 minutes

[oracle@dbvisit1 log]$ ps -ef | grep pmon
oracle    1887     1  0 14:03 ?        00:00:00 ora_pmon_orcl
oracle   18199  1733  0 15:31 pts/0    00:00:00 grep --color=auto pmon
[oracle@dbvisit1 log]$ kill -9 1887
[oracle@dbvisit1 log]$ ps -ef | grep pmon
oracle   18304  1733  0 15:32 pts/0    00:00:00 grep --color=auto pmon
[oracle@dbvisit1 log]$

In the observer logfile we can see that the standby was promoted after 5 retries.

[oracle@dbvisitconsole log]$ tail -f orcl_1_observer.log
2019/07/25 13:24:46 DDC: DDC#1(orcl): Started watchdog: Watchdog successfully started monitoring
2019/07/25 13:33:51 DDC: DDC#1(orcl): rules failing (1/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:34:51 DDC: DDC#1(orcl): rules failing (2/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:35:51 DDC: DDC#1(orcl): rules failing (3/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:36:51 DDC: DDC#1(orcl): rules failing (4/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:37:51 DDC: DDC#1(orcl): rules failing (5/5): primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:37:51 DDC: DDC#1(orcl): configuration failed after 5 retries: primary: error on dbvisit1:7891: unexpected database status for DDC orcl: got: "Database is down", expected: "Regular database open in read write mode"
2019/07/25 13:37:51 DDC: DDC#1(orcl): watchdog shutting down: activation imminent
2019/07/25 13:37:51 DDC: DDC#1(orcl): ACTIVATION started: conditions for activation satisfied
2019/07/25 13:38:41 DDC: DDC#1(orcl): ACTIVATION successful: ACTIVATION OK: standby activated, activation took: 50.043794192s

And we can verify that the standby is now open in read write mode

[oracle@dbvisit2 trace]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Jul 25 15:49:38 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> select db_unique_name,open_mode from v$database;

DB_UNIQUE_NAME                 OPEN_MODE
------------------------------ --------------------
orcl                           READ WRITE

SQL>

Note that we can use a user-defined script with the observer. For more information, please see the dbvisit documentation.

This article "dbvisit 9: Automatic Failover" appeared first on the dbi services Blog.

Alfresco Clustering – Basis & Architectures


This blog will be the first of a series on Alfresco HA/Clustering topics. It’s been too long since I last posted anything related to Alfresco, so I thought about writing a few blogs about my experience with setting up more or less complex HA/Clustering infrastructures. So, let’s start this first part with an introduction to Alfresco HA/Clustering.

If you want to set up a HA/Cluster environment, you will first have to think about where exactly you want to go. Alfresco is composed of several components, so “what do you want to achieve exactly?” would probably be the first question to ask.

Alfresco offers a lot of possibilities; you can more or less do whatever you want. That’s really great, but it also means that you should plan what you want to do first. Do you just want a simple HA architecture for Share+Repository where you can live without Solr for a few minutes/hours (in case of issues), or do you absolutely want all components to always be available? Or maybe you want an HA architecture that is better suited for high throughput? Obviously, there might be some cost details that need to be taken into consideration, linked to the resources but also to the licenses: the Alfresco Clustering license itself, but also the Index Engine license if you go for separate Solr Servers.

That’s what you need to define first to avoid losing time changing configurations and adding more components into the picture later. Alternatively (and that’s something I will try to cover as much as I can), it’s also possible to setup an environment which will allow you to add more components (at least some of them…) as needed without having to change your HA/Clustering configuration, if you are doing it right from the start and if you don’t change too much the architecture itself.

I mentioned earlier the components of Alfresco (Alfresco Content Services, not the company), these are the ones we are usually talking about:

  • *Front-end (Apache HTTPD, Nginx, …)
  • *ActiveMQ
  • Alfresco PDF Renderer
  • Database
  • File System
  • ImageMagick
  • Java
  • LibreOffice
  • *Share (Tomcat)
  • *Repository (Tomcat)
  • *Solr6 (Jetty)

 

In this series of blog, I won’t talk about the Alfresco PDF Renderer, ImageMagick & Java because these are just simple binaries/executables that need to be available from the Repository side. For LibreOffice, it’s usually Alfresco that is managing it directly (multi-processes, restart if crash, aso…). It wouldn’t really make sense to talk about these in blogs related to Clustering. I will also disregard the Database and File System ones since they are usually out of my scope. The Database is usually installed & managed by my colleagues which are DBAs, they are much better at that than me. That leaves us with all components with an asterisk (*). I will update this list with links to the different blogs.

Before jumping into the first component, which will be the subject of the next blog, I wanted to go through some possible architectures for Alfresco. There are a lot of schemas available on the internet, but it’s often the same architecture that is presented, so I thought I would take some time to represent, in my own way, what an Alfresco architecture could look like.

In the below schemas, I represented the main components: Front-end, Share, Repository, Solr, Database & File System (Data) as little boxes. As mentioned previously, I won’t talk about the Database & File System so I just represented them once to see the communications with these but what is behind their boxes can be anything (with HA/Clustering or not). The arrows represent the way communications are initiated: an arrow in a single direction “A->B” means that B is never initiating a communication with A. Boxes that are glued together represent all components installed on the same host (a physical server, a VM, a container or whatever).

 

Alfresco Architecture N°1: This is the simplest architecture for Alfresco. As you can see, it’s not a HA/Clustering architecture, but I decided to start small. I added a Front-end (even if it’s not mandatory) because it’s a best practice and I would not install Alfresco without one. Nothing specific to say on this architecture, it’s just simple.

 

Alfresco Architecture N°2: The first thing to do if you have the simplest architecture in place (N°1) and you start seeing some resource contention is to split the components, and more specifically to install Solr separately. This should really be the minimal architecture to use, whenever possible.

 

Alfresco Architecture N°3: This is the first HA/Clustering architecture. It starts small, as you can see, with just two nodes for each Front-end/Share/Repository stack and a Load Balancer to dispatch the load on each side for an Active/Active solution. The dotted grey lines represent the Clustering communications. In this architecture, there is therefore a Cluster for Share and another one for the Repository layer. The Front-end doesn’t need Clustering since it just forwards the communications; the session itself is on the Tomcat (Share/Repository) side. There is only one Solr node and therefore both Repository boxes will communicate with the Solr node (through the Front-end or not). Between the Repository and Solr, there is one bidirectional arrow and another unidirectional one. That’s because both Repository boxes will initiate searches, but Solr will do its tracking to index new content with only one Repository: this isn’t optimal.

 

Alfresco Architecture N°4: To solve this small issue with Solr tracking, we can add a second Load Balancer in between so that the Solr tracking can target any Repository node. The first bottleneck you will encounter in Alfresco is usually the Repository, because a lot of things are happening in the background at that layer. Therefore, this architecture is usually the simplest HA/Clustering solution that you will want to set up.

 

Alfresco Architecture N°5: If you are facing some performance issues with Solr or if you want all components to be in HA, then you will have to duplicate Solr as well. Between the two Solr nodes, I put a Clustering link; that’s in case you are using Solr Sharding. If you are using the default cores (alfresco and archive), then there is no communication between distinct Solr nodes. If you are using Solr Sharding and you want a HA architecture, then you will have the same shards on both Solr nodes and, in this case, there will be communications between the Solr nodes. It’s not really Clustering so to speak, that’s how Solr Sharding works, but I still used the same representation.

 

Alfresco Architecture N°6: As mentioned previously (for N°4), the Repository is usually the bottleneck. To reduce the load on this layer, it is possible to do several things. The first possibility is to install another Repository and dedicate it to the Solr Tracking. As you can see above, the communications aren’t bidirectional anymore but only unidirectional. Searches will come from the two Repository nodes that are in the Cluster and Solr Tracking will use the separated/dedicated Repository. This third Repository can then be set in read-only, the jobs and services can be disabled, the Clustering can be disabled as well (so it uses the same DB but it’s not part of the Clustering communications because it doesn’t have to be), and so on… I put this third Repository as a standalone box, but obviously you can install it with one of the two Solr nodes.

 

Alfresco Architecture N°7: The next step can be to add another read-only Repository and put these two nodes side by side with the Solr nodes. This way, the Solr Tracking only uses localhost communications, which is therefore a little bit easier to secure.

 

Alfresco Architecture N°8: The previous architectures (N°6 & N°7) introduced a new single point of failure, so to fix this, there is only one way: add a new Load Balancer between Solr and the Repository for the tracking. Behind the Load Balancer, there are two solutions: keep the fourth Repository, which is also read-only, or use a fallback to Repository node1/node2 in case the read-only Repository (node3) isn’t available. For that purpose, the Load Balancer should be, respectively, Active/Active or Active/Passive. As you can see, I chose to represent the first one.

 

These were a few possible architectures. You can obviously add more nodes if you want to, to handle more load. There are many other solutions so have fun designing the best one, according to your requirements.

 

This article "Alfresco Clustering – Basis & Architectures" appeared first on the dbi services Blog.

CloudBees DevOps Playground – Hands On with JenkinsX


Last week, we had the chance to attend the CloudBees DevOps Playground in London. The event was a presentation and a hands-on session on Jenkins X done by one of the most popular guys from CloudBees, Gareth Evans.

Before taking an interest in Jenkins X, we focused most of our time on the Docker and Kubernetes parts. We have enhanced our skills a lot during the last months on the administration of Kubernetes clusters and the deployment of applications, especially the Documentum stack as well as WebLogic.

Jenkins X is quite a new technology in the landscape of automated deployment, and we faced difficulties finding a workshop/training related to Jenkins X administration and usage. So we decided to go to London for this CloudBees DevOps hands-on.

As we work on middleware infrastructures, between system engineering and applications, Jenkins X completely makes sense for us to automate the creation of Kubernetes infrastructure, in terms of both cluster and application deployment.

What’s Jenkins X?

Basically, Jenkins X automates the whole development process end to end for containerized applications based on Docker and Kubernetes.

Overview of Jenkins X:

  • Jenkins X provides an automated CI/CD solution for Kubernetes
  • Buildpacks to quickly create new applications
  • Uses GitOps to manage promotion between Environments
  • Creates Preview Environments on Pull Requests
  • Provides control via ChatOps and feedback on Pull Requests
  • Improves developers’ productivity
  • It is open source
  • Microservices architecture
  • Designed for extension
  • Relies on k8s CRDs

JX Topologies:

Jenkins X can work in 2 modes: Static and Serverless.

Cloud-Native approach:
Our goal will be to use Jenkins X to automate the deployment of containerized applications on a Kubernetes cluster.
Jenkins X enables real collaboration between system engineers and application teams, with a focus on making development teams productive through automation and DevOps best practices.

We will achieve the automation of CI/CD pipelines using Jenkins X as follows:

This is how Jenkins X works (big picture). We will see later how to install JX with the different methods, in the cloud or on-premises, and how to build CI/CD pipelines.
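
As a preview of what comes next, a typical Jenkins X workflow with the jx CLI looks roughly like this (classic/static mode; the exact flags may differ depending on the jx version):

# create a GKE cluster and install Jenkins X on it
$ jx create cluster gke

# scaffold a new application from a quickstart and import it into Jenkins X
$ jx create quickstart

# follow the pipeline activity triggered by the first commit
$ jx get activities

# promote a release to the production environment through GitOps
$ jx promote --env production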

This article "CloudBees DevOps Playground – Hands On with JenkinsX" appeared first on the dbi services Blog.


Alfresco Clustering – Repository


In a previous blog, I talked about some basics and presented some possible architectures for Alfresco. Now that this introduction has been done, let’s dig into the real blogs about how to set up a HA/Clustering Alfresco environment. In this blog in particular, I will talk about the Repository layer.

For the Repository Clustering, there are three prerequisites (and that’s all you need):

  • A valid license which include the Repository Clustering
  • A shared file system which is accessible from all Alfresco nodes in the Cluster. This is usually a NAS accessed via NFS
  • A shared database

 

Clustering the Repository part is really simple to do: you just need to put the correct properties in the alfresco-global.properties file. Of course, you could also manage it all from the Alfresco Admin Console, but that’s not recommended; you should really always use the alfresco-global.properties by default. The Alfresco Repository Clustering uses Hazelcast. It used JGroups and EHCache as well before Alfresco 4.2, but now it’s just Hazelcast. So to define an Alfresco Cluster, simply put the following configuration in the alfresco-global.properties of Alfresco node1:

[alfresco@alf_n1 ~]$ getent hosts `hostname` | awk '{ print $1 }'
10.10.10.10
[alfresco@alf_n1 ~]$
[alfresco@alf_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### Content Store
dir.root=/shared_storage/alf_data
...
### DB
db.username=alfresco
db.password=My+P4ssw0rd
db.name=alfresco
db.host=db_vip
## MySQL
#db.port=3306
#db.driver=com.mysql.jdbc.Driver
#db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode=yes&characterEncoding=UTF-8
#db.pool.validate.query=SELECT 1
## PostgreSQL
db.driver=org.postgresql.Driver
db.port=5432
db.url=jdbc:postgresql://${db.host}:${db.port}/${db.name}
db.pool.validate.query=SELECT 1
## Oracle
#db.driver=oracle.jdbc.OracleDriver
#db.port=1521
#db.url=jdbc:oracle:thin:@${db.host}:${db.port}:${db.name}
#db.pool.validate.query=SELECT 1 FROM DUAL
...
### Clustering
alfresco.cluster.enabled=true
alfresco.cluster.interface=10.10.10.10-11
alfresco.cluster.nodetype=Alfresco_node1
alfresco.hazelcast.password=Alfr3sc0_hz_Test_pwd
alfresco.hazelcast.port=5701
alfresco.hazelcast.autoinc.port=false
alfresco.hazelcast.max.no.heartbeat.seconds=15
...
[alfresco@alf_n1 ~]$

 

And for Alfresco node2, you can use pretty much the same content (only the cluster nodetype differs):

[alfresco@alf_n2 ~]$ getent hosts `hostname` | awk '{ print $1 }'
10.10.10.11
[alfresco@alf_n2 ~]$
[alfresco@alf_n2 ~]$ cat $CATALINA_HOME/shared/classes/alfresco-global.properties
...
### Content Store
dir.root=/shared_storage/alf_data
...
### DB
db.username=alfresco
db.password=My+P4ssw0rd
db.name=alfresco
db.host=db_vip
## MySQL
#db.port=3306
#db.driver=com.mysql.jdbc.Driver
#db.url=jdbc:mysql://${db.host}:${db.port}/${db.name}?useUnicode=yes&characterEncoding=UTF-8
#db.pool.validate.query=SELECT 1
## PostgreSQL
db.driver=org.postgresql.Driver
db.port=5432
db.url=jdbc:postgresql://${db.host}:${db.port}/${db.name}
db.pool.validate.query=SELECT 1
## Oracle
#db.driver=oracle.jdbc.OracleDriver
#db.port=1521
#db.url=jdbc:oracle:thin:@${db.host}:${db.port}:${db.name}
#db.pool.validate.query=SELECT 1 FROM DUAL
...
### Clustering
alfresco.cluster.enabled=true
alfresco.cluster.interface=10.10.10.10-11
alfresco.cluster.nodetype=Alfresco_node2
alfresco.hazelcast.password=Alfr3sc0_hz_Test_pwd
alfresco.hazelcast.port=5701
alfresco.hazelcast.autoinc.port=false
alfresco.hazelcast.max.no.heartbeat.seconds=15
...
[alfresco@alf_n2 ~]$

 

Description of the Clustering parameters:

  • alfresco.cluster.enabled: Whether or not you want to enable the Repository Clustering for the local Alfresco node. The default value is false. You will want to set that to true for all Repository nodes that will be used by Share or any other client. If the Repository is only used for Solr Tracking, you can leave that to false
  • alfresco.cluster.interface: This is the network interface on which Hazelcast will listen for Clustering messages. This has to be an IP, it can’t be a hostname. To keep things simple and to have the same alfresco-global.properties on all Alfresco nodes however, it is possible to use a specific nomenclature:
    • 10.10.10.10: Hazelcast will try to bind on 10.10.10.10 only. If it’s not available, then it won’t start
    • 10.10.10.10-11: Hazelcast will try to bind on any IP within the range 10-11 so in this case 2 IPs: 10.10.10.10 or 10.10.10.11. If you have, let’s say, 4 IPs assigned to the local host and you don’t want Hazelcast to use 2 of these, then specify the ones that it can use and it will pick one from the list. This can also be used to have the same content for the alfresco-global.properties on different hosts… One server with IP 10.10.10.10 and a second one with IP 10.10.10.11
    • 10.10.10.* or 10.10.*.*: Hazelcast will try to bind on any IP in this range, this is an extended version of the XX-YY range above
  • alfresco.cluster.nodetype: A human-friendly string to represent the local Alfresco node. It doesn’t have any use for Alfresco, that’s really more for you. It is for example interesting to put a specific string for Alfresco node that won’t take part in the Clustering but that are still using the same Content Store and Database (like a Repository dedicated for the Solr Tracking, as mentioned above)
  • alfresco.hazelcast.password: The password to use for the Alfresco Repository Cluster. You need to use the same password for all members of the same Cluster. You should as well try to use a different password for each Cluster that you might have if they are in the same network (DEV/TEST/PROD for example), otherwise it will get ugly
  • alfresco.hazelcast.port: The default port that will be used for Clustering messages between the different members of the Cluster
  • alfresco.hazelcast.autoinc.port: Whether or not you want to allow Hazelcast to find another free port in case the default port (“alfresco.hazelcast.port”) is currently used. It will increment the port by 1 each time. You should really set this to false and just use the default port, to have full control over the channels that Clustering communications are using otherwise it might get messy as well
  • alfresco.hazelcast.max.no.heartbeat.seconds: The maximum time in seconds allowed between two heartbeats. If there is no heartbeat within this period of time, Alfresco will assume the remote node isn’t running/available

 

As you can see above, it’s really simple to add Clustering to an Alfresco Repository. Since you can(should?) have the same set of properties (except the nodetype string maybe), then it also really simplifies the deployment… If you are familiar with other Document Management System like Documentum for example, then you understand the complexity of some of these solutions! If you compare that to Alfresco, it’s like walking on the street versus walking on the moon where you obviously first need to go to the moon… Anyway, once it’s done, the logs of the Alfresco Repository node1 will display something like that when you start it:

2019-07-20 15:14:25,401  INFO  [cluster.core.ClusteringBootstrap] [localhost-startStop-1] Cluster started, name: MainRepository-<generated_id>
2019-07-20 15:14:25,405  INFO  [cluster.core.ClusteringBootstrap] [localhost-startStop-1] Current cluster members:
  10.10.10.10:5701 (hostname: alf_n1)

 

Wait for Repository node1 to be fully started and, once done, you can start Repository node2 (the nodes normally need to be started sequentially). You will see in the logs of Repository node1 that another node automatically joined the Cluster:

2019-07-20 15:15:06,528  INFO  [cluster.core.MembershipChangeLogger] [hz._hzInstance_1_MainRepository-<generated_id>.event-3] Member joined: 10.10.10.11:5701 (hostname: alf_n2)
2019-07-20 15:15:06,529  INFO  [cluster.core.MembershipChangeLogger] [hz._hzInstance_1_MainRepository-<generated_id>.event-3] Current cluster members:
  10.10.10.10:5701 (hostname: alf_n1)
  10.10.10.11:5701 (hostname: alf_n2)

 

On the logs of the Repository node2, you can see directly at the initialization of the Hazelcast Cluster that the two nodes are available.

If you don’t want to check the logs, you can see pretty much the same thing from the Alfresco Admin Console. By accessing “http(s)://<hostname>:<port>/alfresco/s/enterprise/admin/admin-clustering“, you can see currently available cluster members (online nodes), non-available cluster members (offline nodes) as well as connected non-cluster members (nodes using the same DB & Content Store but with “alfresco.cluster.enabled=false”, for example to dedicate a Repository to Solr Tracking).

Alfresco also provides a small utility to check the health of the cluster which will basically ensure that the communication between each member is successful. This utility can be accessed at “http(s)://<hostname>:<port>/alfresco/s/enterprise/admin/admin-clustering-test“. It is useful to include a quick check using this utility in a monitoring solution for example, to ensure that the cluster is healthy.
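
For example, a monitoring probe could simply call this webscript with an administrator account and check the HTTP return code and/or the page content (the URL, port and credentials below are assumptions to adapt to your environment):

[alfresco@alf_n1 ~]$ curl -s -o /dev/null -w "%{http_code}\n" -u admin:AdminPwd "http://localhost:8080/alfresco/s/enterprise/admin/admin-clustering-test"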

 

This article "Alfresco Clustering – Repository" appeared first on the dbi services Blog.

Java JDK12: JEP 325: Switch Expressions


Eclipse setup

In order to test Java JDK12, you will have to download at least Eclipse 4.11. Then download JDK12 from Oracle’s web site and configure Eclipse to use this JDK for the project. Also download the JDK12 support from the following repository link:

https://download.eclipse.org/eclipse/updates/4.11-P-builds

Edit the project’s properties to use “Java Compiler” compliance 12 and set “Enable preview features” to TRUE. Now you should be able to run JDK12 examples. Here is a link to the complete setup:

https://marketplace.eclipse.org/content/java-12-support-eclipse-2019-03-411
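
If you prefer to test outside of Eclipse, the preview feature can also be compiled and run from the command line (SwitchExample.java is a hypothetical class containing the snippets below):

$ javac --release 12 --enable-preview SwitchExample.java
$ java --enable-preview SwitchExample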

Switch expression

The switch expression gets a facelift in this version. It allows more readability and more flexibility. We can now:

  • Get rid of the break keyword in certain cases
  • Return a value from the switch
  • Put several values in one “case”

With the following example you can see how to get rid of the “break” and use a “case L ->” switch label. In addition, the switch returns a value which can be used after the expression.

String alias = switch (day) {
    case MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY -> "Week day";
    case SATURDAY, SUNDAY                             -> "Weekend";
};
System.out.println(alias);

The previous few lines replace the following ones:

switch (day) {
    case MONDAY:
        System.out.println("Week day");
        break;
    case TUESDAY:
        System.out.println("Week day");
        break;
    case WEDNESDAY:
        System.out.println("Week day");
        break;
    case THURSDAY:
        System.out.println("Week day");
        break;
    case FRIDAY:
        System.out.println("Week day");
        break;
    case SATURDAY:
        System.out.println("Weekend");
        break;
    case SUNDAY:
        System.out.println("Weekend");
        break;
}

Here is the general form of the expression:

T result = switch (arg) {
    case L1 -> e1;
    case L2 -> e2;
    default -> e3;
};

When using a block of code, you still need a “break” to return a value from that branch:

int i = switch (day) {
    case MONDAY -> {
        System.out.println("Monday");
        break 0; // break is needed here to return a value from the block
    }
    default -> 1;
};

Thanks to this new notation, we can now return values from a switch expression. In addition, it makes big switch expressions more readable. Note that you can still use the “old” way, so existing code will still compile with the new JDK. For the moment it is only available as a preview in JDK 12, which is a non-LTS release, but if the new expression is validated by tests and feedback, it will definitely be part of future releases, including LTS versions, and you will be able to use it then.
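
To put the pieces together, here is a minimal, self-contained sketch (hypothetical Day enum and class name) combining the arrow labels, the multiple labels per case and the returned value; compile and run it with the --enable-preview flag on JDK 12:

public class SwitchExpressionDemo {

    enum Day { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY }

    // Returns a label for the given day using the new arrow syntax and multiple labels per case
    static String alias(Day day) {
        return switch (day) {
            case MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY -> "Week day";
            case SATURDAY, SUNDAY                             -> "Weekend";
        };
    }

    public static void main(String[] args) {
        System.out.println(alias(Day.SATURDAY)); // prints "Weekend"
    }
}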

This article Java JDK12: JEP 325: Switch Expressions appeared first on Blog dbi services.

Red Hat 8 Server in 9 minutes into the Google Cloud


Do you have 9 minutes to learn how to quickly get started with the Google Cloud? Here is a simple copy/paste procedure to successfully deploy a Red Hat 8 compute server into the Cloud.

Installation of the Google Cloud command-line client

First download the client and uncompress it:

$ curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-255.0.0-linux-x86_64.tar.gz
$ tar zxvf google-cloud-sdk-255.0.0-linux-x86_64.tar.gz

Optionally configure the PATH for your comfort:
$ ./google-cloud-sdk/install.sh

The following command will connect your local environment to your cloud account. Just follow the provided instructions:
$ gcloud init --console-only

Now you’re connected and ready to go 🙂
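
If you want to double-check which account and project the client is now bound to (optional), the following commands will show it:
$ gcloud auth list
$ gcloud config list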

Creation of your compute node Red Hat 8

First of all, here is a simple way to get help from your local environment:
$ gcloud compute -h
Usage: gcloud compute [optional flags] <group | command>
group may be accelerator-types | addresses | backend-buckets |
backend-services | commitments | diagnose |
disk-types | disks | firewall-rules |
forwarding-rules | health-checks | http-health-checks |
https-health-checks | images | instance-groups |
instance-templates | instances | interconnects |
machine-types | network-endpoint-groups | networks |
operations | os-login | project-info | regions |
reservations | resource-policies | routers | routes |
security-policies | shared-vpc | snapshots |
sole-tenancy | ssl-certificates | ssl-policies |
target-http-proxies | target-https-proxies |
target-instances | target-pools | target-ssl-proxies |
target-tcp-proxies | target-vpn-gateways | tpus |
url-maps | vpn-tunnels | zones
command may be config-ssh | connect-to-serial-port | copy-files |
reset-windows-password | scp | sign-url | ssh |
start-iap-tunnel

List the available images and search for the Red Hat 8 one (or whichever you need) to create your compute server:
$ gcloud compute images list | grep rhel-8
rhel-8-v20190618 rhel-cloud rhel-8 READY

Then, you may want your compute server to be located close to you. I’ll choose Europe and specifically the Frankfurt area:

$ gcloud compute zones list | grep europe
europe-west4-a europe-west4 UP
europe-west4-b europe-west4 UP
europe-west4-c europe-west4 UP
europe-west1-b europe-west1 UP
europe-west1-d europe-west1 UP
europe-west1-c europe-west1 UP
europe-west3-c europe-west3 UP ----> Germany, Frankfurt (zone c)
europe-west3-a europe-west3 UP ----> Germany, Frankfurt (zone a)
europe-west3-b europe-west3 UP ----> Germany, Frankfurt (zone b)
europe-west2-c europe-west2 UP
europe-west2-b europe-west2 UP
europe-west2-a europe-west2 UP
europe-north1-a europe-north1 UP
europe-north1-b europe-north1 UP
europe-north1-c europe-north1 UP
europe-west6-a europe-west6 UP
europe-west6-b europe-west6 UP
europe-west6-c europe-west6 UP

Now you’re ready to create your compute server with a few CPU cores and some memory:
$ gcloud compute instances create openshift-master --zone europe-west3-a --custom-cpu=2 --custom-memory=2G --image-project=rhel-cloud --image-family=rhel-8
Created [https://www.googleapis.com/compute/v1/projects/vivid-env-242712/zones/europe-west3-a/instances/openshift-master].
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
openshift-master europe-west3-a custom (2 vCPU, 2.00 GiB) 10.156.0.3 35.234.117.157 RUNNING

After a few seconds, the compute server is reachable:
$ ping -c 1 35.234.117.157
PING 35.234.117.157 (35.234.117.157) 56(84) bytes of data.
64 bytes from 35.234.117.157: icmp_seq=1 ttl=59 time=55.4 ms
--- 35.234.117.157 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 55.453/55.453/55.453/0.000 ms

Now you can configure your local ssh environment for convenience and passwordless authentication:
$ gcloud compute config-ssh
You should now be able to use ssh/scp with your instances.
For example, try running:
$ ssh openshift-master.europe-west3-a.vivid-env-242712
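
As an alternative (just a sketch, not required for the rest of this post), you could also let gcloud handle the ssh key and the connection in one go:
$ gcloud compute ssh openshift-master --zone europe-west3-a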

Connect to the instance and verify your OS release:
$ ssh openshift-master.europe-west3-a.vivid-env-242712
The authenticity of host 'compute.6016769543430629535 (35.234.117.157)' can't be established.
ECDSA key fingerprint is SHA256:aiHQl7nOQsqLNEmZ2yBxe1Ry8zlkBA8fLe0pKJuEm0s.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'compute.6016769543430629535' (ECDSA) to the list of known hosts.
[nico@openshift-master ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="8.0 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.0"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.0 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.0:GA"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

Cleaning up

Don’t forget to destroy your resources once you have finished your work, otherwise the monthly bill will remind you:
$ gcloud compute instances list
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
openshift-master europe-west3-a custom (2 vCPU, 2.00 GiB) 10.156.0.3 35.234.117.157 RUNNING

$ gcloud compute instances delete openshift-master --zone europe-west3-a --quiet
Deleted [https://www.googleapis.com/compute/v1/projects/vivid-env-242712/zones/europe-west3-a/instances/openshift-master].

Conclusion

I find it quite simple to deploy components into the Google Cloud. Do you? Of course, that was a simple infrastructure with a single server. You may need to deploy more durable components and interconnect them for a more complex infrastructure. Are you about to start a new deployment into the Cloud?

I hope this helps and please don’t hesitate to contact us or comment below should you need more detail.

This article Red Hat 8 Server in 9 minutes into the Google Cloud appeared first on Blog dbi services.

Alfresco Clustering – Share


In previous blogs, I talked about some basics and presented some possible architectures for Alfresco, and I covered the Clustering setup for the Alfresco Repository. In this one, I will work on the Alfresco Share layer. Therefore, if you are using another client like a CMIS/REST client or an ADF Application, it won’t work that way, and you might or might not need Clustering at that layer; it depends on how the Application is working.

The Alfresco Share Clustering is used only for the caches, so you could technically have multiple Share nodes working with a single Repository or a Repository Cluster without the Share Clustering. For that, you would have to disable the caches on the Share layer because, if you kept them enabled, you would eventually face issues. Alfresco therefore introduced a Share Clustering which keeps the caches in sync, so you don’t have to disable them anymore. When needed, cache invalidation messages are sent from one Share node to all others; that includes runtime application property changes as well as new/existing site/user dashboard changes.

Just like for the Repository part, it’s really easy to set up the Share Clustering, so there is really no reason not to. It also uses Hazelcast but it’s not based on properties that you configure in the alfresco-global.properties (because it’s a Share configuration); this one must be done in an XML file and there is obviously no possibility to do that in the Alfresco Admin Console.

All Share configurations/customizations are put in the “$CATALINA_HOME/shared/classes/alfresco/web-extension” folder, and this one is no exception. There are two possibilities for the Share Clustering communication:

  • Multicast
  • Unicast (TCP-IP in Hazelcast)

 

I. Multicast

If you do not know how many nodes will participate in your Share Cluster or if you want to be able to add more nodes in the future without having to change the previous nodes’ configuration, then you probably want to check and opt for the Multicast option. Just create a new file “$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml” and put this content inside it:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hz="http://www.hazelcast.com/schema/spring"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="true" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="false">
            <hz:members></hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

In the above configuration, be sure to set a topic name (matching the hazelcastTopicName’s value) as well as a group password that is specific to this environment, so you don’t end up with a single Cluster whose members come from different environments. For the Share layer it’s less of an issue than for the Repository layer, but still. Also be sure to use a network port that isn’t already in use; it will be the port that Hazelcast binds itself to on the local host. For the Alfresco Repository Clustering we used 5701, so here it’s 5801 for example.

There isn’t much more to say about this configuration: we just enabled multicast with an IP and a port to be used, and we disabled the tcp-ip one.

The interfaces section is disabled by default but you can enable it if you want to. If it’s disabled, Hazelcast will list all local interfaces (127.0.0.1, local_IP1, local_IP2, …) and choose one from this list. If you want to force Hazelcast to use a specific local network interface, then enable this section and add the interface here, as shown in the sketch after the list below. It can use the following nomenclature (IPs only!):

  • 10.10.10.10: Hazelcast will try to bind on 10.10.10.10 only. If it’s not available, then it won’t start
  • 10.10.10.10-11: Hazelcast will try to bind on any IP within the range 10-11 so in this case 2 IPs: 10.10.10.10 or 10.10.10.11. If you have, let’s say, 5 IPs assigned to the local host and you don’t want Hazelcast to use 3 of these, then specify the ones that it can use and it will pick one from the list. This can also be used to have the same content for the custom-slingshot-application-context.xml on different hosts… One server with IP 10.10.10.10 and a second one with IP 10.10.10.11
  • 10.10.10.* or 10.10.*.*: Hazelcast will try to bind on any IP in this range, this is an extended version of the XX-YY range above
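
For example, here is a sketch only (reusing the interfaces element from the configuration above) that would force Hazelcast to pick one address in the 10.10.10.10-11 range:

        <hz:interfaces enabled="true">
          <hz:interface>10.10.10.10-11</hz:interface>
        </hz:interfaces>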

 

For most cases, keeping the interfaces section disabled is sufficient since Hazelcast will just pick an available one. You might think that Hazelcast may bind itself to 127.0.0.1; technically it’s possible since it’s a local network interface, but I have never seen it do so, so I assume there is some kind of preference order when another IP is available.

Membership in Hazelcast is based on “age”, meaning that the oldest member will be the one to lead. There are no predefined Master or Slave members, they are all equal, but the oldest/first member is the one that checks whether new members are allowed to join (correct config) and, if so, sends the information to all members that already joined so they are all aligned. If multicast is enabled, a multicast listener is started to listen for new membership requests.

 

II. Unicast

If you already know how many nodes will participate in your Share Cluster or if you prefer to avoid Multicast messages (there is no real need to overload your network with such things…), then it’s preferable to use Unicast messaging. For that purpose, just create the same file as above (“$CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml“) but instead, use the tcp-ip section:

[alfresco@share_n1 ~]$ cat $CATALINA_HOME/shared/classes/alfresco/web-extension/custom-slingshot-application-context.xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hz="http://www.hazelcast.com/schema/spring"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
                           http://www.hazelcast.com/schema/spring
                           http://www.hazelcast.com/schema/spring/hazelcast-spring-2.4.xsd">

  <hz:topic id="topic" instance-ref="webframework.cluster.slingshot" name="share_hz_test"/>
  <hz:hazelcast id="webframework.cluster.slingshot">
    <hz:config>
      <hz:group name="slingshot" password="Sh4r3_hz_Test_pwd"/>
      <hz:network port="5801" port-auto-increment="false">
        <hz:join>
          <hz:multicast enabled="false" multicast-group="224.2.2.5" multicast-port="54327"/>
          <hz:tcp-ip enabled="true">
            <hz:members>share_n1.domain,share_n2.domain</hz:members>
          </hz:tcp-ip>
        </hz:join>
        <hz:interfaces enabled="false">
          <hz:interface></hz:interface>
        </hz:interfaces>
      </hz:network>
    </hz:config>
  </hz:hazelcast>

  <bean id="webframework.cluster.clusterservice" class="org.alfresco.web.site.ClusterTopicService" init-method="init">
    <property name="hazelcastInstance" ref="webframework.cluster.slingshot" />
    <property name="hazelcastTopicName">
      <value>share_hz_test</value>
    </property>
  </bean>

</beans>
[alfresco@share_n1 ~]$

 

The description is basically the same as for the Multicast part. The main difference is that multicast was disabled, tcp-ip was enabled and there is therefore a list of members that needs to be set. This is a comma-separated list of hostnames or IPs that Hazelcast will try to contact when it starts. Membership in the Unicast case is managed in the same way, except that the oldest/first member listens for new membership requests over TCP-IP. So it’s the same principle, just done differently.

Starting the first Share node in the Cluster will display the following information on the logs:

Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n1.domain' to address(es): [127.0.0.1, 10.10.10.10]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Resolving domain name 'share_n2.domain' to address(es): [10.10.10.11]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [share_n1.domain/10.10.10.10, share_n2.domain/10.10.10.11, share_n1.domain/127.0.0.1]
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Prefer IPv4 stack is true.
Jul 28, 2019 11:45:35 AM com.hazelcast.impl.AddressPicker
INFO: Picked Address[share_n1.domain]:5801, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5801], bind any local is true
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Hazelcast Community Edition 2.4 (20121017) starting at Address[share_n1.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.system
INFO: [share_n1.domain]:5801 [slingshot] Copyright (C) 2008-2012 Hazelcast.com
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTING
Jul 28, 2019 11:45:36 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot] Connecting to possible member: Address[share_n2.domain]:5801
Jul 28, 2019 11:45:36 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.nio.SocketConnector
INFO: [share_n1.domain]:5801 [slingshot] Could not connect to: share_n2.domain/10.10.10.11:5801. Reason: ConnectException[Connection refused]
Jul 28, 2019 11:45:37 AM com.hazelcast.impl.TcpIpJoiner
INFO: [share_n1.domain]:5801 [slingshot]

Members [1] {
        Member [share_n1.domain]:5801 this
}

Jul 28, 2019 11:45:37 AM com.hazelcast.impl.LifecycleServiceImpl
INFO: [share_n1.domain]:5801 [slingshot] Address[share_n1.domain]:5801 is STARTED
2019-07-28 11:45:37,164  INFO  [web.site.ClusterTopicService] [localhost-startStop-1] Init complete for Hazelcast cluster - listening on topic: share_hz_test

 

Then starting a second node of the Share Cluster will display the following (still on the node1 logs):

Jul 28, 2019 11:48:31 AM com.hazelcast.nio.SocketAcceptor
INFO: [share_n1.domain]:5801 [slingshot] 5801 is accepting socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:31 AM com.hazelcast.nio.ConnectionManager
INFO: [share_n1.domain]:5801 [slingshot] 5801 accepted socket connection from /10.10.10.11:34191
Jul 28, 2019 11:48:38 AM com.hazelcast.cluster.ClusterManager
INFO: [share_n1.domain]:5801 [slingshot]

Members [2] {
        Member [share_n1.domain]:5801 this
        Member [share_n2.domain]:5801
}

 

This article Alfresco Clustering – Share appeared first on Blog dbi services.

Windows Docker containers, when platform matters


A couple of days ago, I got a question from a customer about an issue he ran into when trying to spin up a container on Windows.

The context was as follows:

> docker container run hello-world:nanoserver
Unable to find image 'hello-world:nanoserver' locally
nanoserver: Pulling from library/hello-world
C:\Program Files\Docker\docker.exe: no matching manifest for windows/amd64 10.0.14393 in the manifest list entries.
See 'C:\Program Files\Docker\docker.exe run --help'.

 

I thought that was very interesting because it pointed out some considerations about Docker image architecture design. First, we must bear in mind that containers and the underlying host share a single kernel by design and the container’s base image must match that of the host.

Let’s first begin with containers in a Linux world because it highlights the concept of Kernel sharing between different distros. In this demo, let’s say I’m running a Linux Ubuntu server 16.04 …

$ cat /etc/os-release | grep -i version
VERSION="16.04.6 LTS (Xenial Xerus)"
VERSION_ID="16.04"
VERSION_CODENAME=xenial

 

… and let’s say I want to run a container based on Centos 6.6 …

$ docker run --rm -ti centos:6.6 cat /etc/centos-release
Unable to find image 'centos:6.6' locally
6.6: Pulling from library/centos
5dd797628260: Pull complete
Digest: sha256:32b80b90ba17ed16e9fa3430a49f53ff6de0d4c76ad8631717a1373d5921fa26
Status: Downloaded newer image for centos:6.6
CentOS release 6.6 (Final)

 

You may wonder how it is possible to run different distros in the container and on the host, and what the magic behind the scenes is. In fact, both the container and the host share the same Linux kernel, and even if CentOS 6.6 ships with a 2.6 kernel while Ubuntu 16.04 ships with 4.4, the CentOS user space usually runs fine on the newer kernel since the kernel is backward compatible. The commands below demonstrate that the centos container is using the same kernel as the host.

$ uname -r
4.4.0-142-generic
$ docker run --rm -ti centos:6.6 uname -r
4.4.0-142-generic

 

Let’s say now my Docker host is running on the x64 architecture. If we look at the supported architectures of the Centos image on Docker Hub, we notice several different ones.

From that list, we may deduce that there is a combination of different images and tags for each available architecture, and the interesting question is: how does Docker pull the correct one for my underlying architecture? This is where manifest lists come into play and allow multi-architecture images. A manifest list contains platform-segregated references to single-platform manifest entries. We may inspect a manifest list through the docker manifest command (still in experimental mode at the time of writing this blog post).

For example, if I want to get a list of manifests and their corresponding architectures for the Centos 7, I can run docker manifest command as follows:

$ docker manifest inspect centos:7 --verbose
[
        {
                "Ref": "docker.io/library/centos:7@sha256:ca58fe458b8d94bc6e3072f1cfbd334855858e05e1fd633aa07cf7f82b048e66",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:ca58fe458b8d94bc6e3072f1cfbd334855858e05e1fd633aa07cf7f82b048e66",
                        "size": 529,
                        "platform": {
                                "architecture": "amd64",
                                "os": "linux"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2182,
                                "digest": "sha256:9f38484d220fa527b1fb19747638497179500a1bed8bf0498eb788229229e6e1"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 75403831,
                                        "digest": "sha256:8ba884070f611d31cb2c42eddb691319dc9facf5e0ec67672fcfa135181ab3df"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:9fd67116449f225c6ef60d769b5219cf3daa831c5a0a6389bbdd7c952b7b352d",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:9fd67116449f225c6ef60d769b5219cf3daa831c5a0a6389bbdd7c952b7b352d",
                        "size": 529,
                        "platform": {
                                "architecture": "arm",
                                "os": "linux",
                                "variant": "v7"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2181,
                                "digest": "sha256:8c52f2d0416faa8009082cf3ebdea85b3bc1314d97925342be83bc9169178efe"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 70029389,
                                        "digest": "sha256:193bcbf05ff9ae85ac1a58cacd9c07f8f4297dc648808c347cceb3797ae603af"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:f25f24daae92b5b5fe75bc0d5d9a3d2145906290f25aa434c43bfcefecd10dec",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:f25f24daae92b5b5fe75bc0d5d9a3d2145906290f25aa434c43bfcefecd10dec",
                        "size": 529,
                        "platform": {
                                "architecture": "arm64",
                                "os": "linux",
                                "variant": "v8"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2183,
                                "digest": "sha256:7a51de8a65d533b6706fbd63beea13610e5486e49141610e553a3e784c133a37"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 74163767,
                                        "digest": "sha256:90c48ff53512085fb5adaf9bff8f1999a39ce5e5b897f5dfe333555eb27547a7"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:1f832b4e3b9ddf67fd77831cdfb591ce5e968548a01581672e5f6b32ce1212fe",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:1f832b4e3b9ddf67fd77831cdfb591ce5e968548a01581672e5f6b32ce1212fe",
                        "size": 529,
                        "platform": {
                                "architecture": "386",
                                "os": "linux"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2337,
                                "digest": "sha256:fe70670fcbec5e3b3081c6800cb531002474c36563689b450d678a34a89b62c3"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 75654099,
                                        "digest": "sha256:39016a8400a36ce04799adba71f8678ae257d9d8dba638d81b8c5755f01fe213"
                                }
                        ]
                }
        },
        {
                "Ref": "docker.io/library/centos:7@sha256:2d9b27e9c89d511a58873254d86ecf96df0f599daae3d555d896fee9f49fedf4",
                "Descriptor": {
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "digest": "sha256:2d9b27e9c89d511a58873254d86ecf96df0f599daae3d555d896fee9f49fedf4",
                        "size": 529,
                        "platform": {
                                "architecture": "ppc64le",
                                "os": "linux"
                        }
                },
                "SchemaV2Manifest": {
                        "schemaVersion": 2,
                        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                        "config": {
                                "mediaType": "application/vnd.docker.container.image.v1+json",
                                "size": 2185,
                                "digest": "sha256:c9744f4afb966c58d227eb6ba03ab9885925f9e3314edd01d0e75481bf1c937d"
                        },
                        "layers": [
                                {
                                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                                        "size": 76787221,
                                        "digest": "sha256:deab1c539926c1ca990d5d025c6b37c649bbba025883d4b209e3b52b8fdf514a"
                                }
                        ]
                }
        }
]

 

Each manifest entry contains different information including the image signature digest, the operating system and the supported architecture. Let’s pull the Centos:7 image:

$ docker pull centos:7
7: Pulling from library/centos
8ba884070f61: Pull complete
Digest: sha256:a799dd8a2ded4a83484bbae769d97655392b3f86533ceb7dd96bbac929809f3c
Status: Downloaded newer image for centos:7
docker.io/library/centos:7

 

Let’s have a look at the unique identifier of the centos:7 image:

$ docker inspect --format='{{.Id}}' centos:7
sha256:9f38484d220fa527b1fb19747638497179500a1bed8bf0498eb788229229e6e1

 

It corresponds to the config digest inside the SchemaV2Manifest of the manifest entry related to the x64 architecture (please refer to the docker manifest inspect output above). Another official way to query manifest lists and architectures is to go through the mplatform/mquery container as follows:

$ docker run mplatform/mquery centos:7
Image: centos:7
 * Manifest List: Yes
 * Supported platforms:
   - linux/amd64
   - linux/arm/v7
   - linux/arm64
   - linux/386
   - linux/ppc64le

 

However, for a Linux Centos 6.6 image (used in my first demo) the architecture support seems to be limited to  the x64 architecture:

$ docker run mplatform/mquery centos:6.6
Image: centos:6.6
 * Manifest List: Yes
 * Supported platforms:
   - linux/amd64

 

Now that we are aware of manifest lists and multi-architecture images, let’s go back to the initial problem. The customer ran into a platform compatibility issue when trying to spin up the hello-world:nanoserver container on a Windows Server 2016 Docker host. As a reminder, the error message was:

no matching manifest for windows/amd64 10.0.14393 in the manifest list entries.

In a way, that may be surprising because the Windows host and containers also share a single kernel. That’s true, and it was precisely the root cause of my customer’s issue. The image he wanted to pull supports only the following Windows platforms (queried from the manifest list):

> docker run mplatform/mquery hello-world:nanoserver
Image: hello-world:nanoserver
 * Manifest List: Yes
 * Supported platforms:
   - windows/amd64:10.0.17134.885
   - windows/amd64:10.0.17763.615

 

You may notice several supported Windows platforms, but with different operating system versions. Let’s have a look at the Docker host version in my customer’s context:

> [System.Environment]::OSVersion.Version
Major  Minor  Build  Revision
-----  -----  -----  --------
10     0      14393  0

 

The tricky part is that Windows Server 2016 comes with different branches – 1607, 1709 and 1803 – which aren’t technically all the same Windows Server version. Each branch comes with a different build number and, referring to the Microsoft documentation, when the build number (3rd column above) changes, a new operating system version is published. What it means in our case is that the OS versions of the Windows Docker host and of the Docker image we tried to pull are different, hence the compatibility issue we experienced. Let’s also note that images and containers may run on a host with a newer version, but the opposite is obviously not true. You can refer to the same Microsoft link to get a picture of Windows container and host compatibility.
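
If you want to compare the versions before pulling anything, the manifest list already carries this information. Here is a hedged sketch (the experimental docker manifest command is required and the output is abridged; the os.version values shown are the ones reported by mquery above):

> docker manifest inspect hello-world:nanoserver
...
        "platform": {
            "architecture": "amd64",
            "os": "windows",
            "os.version": "10.0.17763.615"
        }
...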

How do we fix this issue? Well, we may go two ways here. The first one consists in re-installing a Docker host platform compatible with the corresponding image. The second one consists in using an image compatible with the current platform and, looking at the hello-world image tags, we have one. We may check the compatibility by querying the manifest list as follows:

> docker run mplatform/mquery hello-world:nanoserver-sac2016
Image: hello-world:nanoserver-sac2016
 * Manifest List: Yes
 * Supported platforms:
   - windows/amd64:10.0.14393.2551

 

Let’s try to pull the image with the nanoserver-sac2016 tag:

> docker pull hello-world:nanoserver-sac2016
nanoserver-sac2016: Pulling from library/hello-world
bce2fbc256ea: Already exists
6f2071dcd729: Pull complete
909cdbafc9e1: Pull complete
a43e426cc5c9: Pull complete
Digest: sha256:878fd913010d26613319ec7cc83b400cb92113c314da324681d9fecfb5082edc
Status: Downloaded newer image for hello-world:nanoserver-sac2016
docker.io/library/hello-world:nanoserver-sac2016

 

Here we go!

See you!

This article Windows Docker containers, when platform matters appeared first on Blog dbi services.
